Date: Fri, 28 Jun 2019 09:37:21 -0700
From: Matthew Wilcox
To: Dan Williams
Cc: linux-nvdimm, Jan Kara, stable, Robert Barror, Seema Pandit,
    linux-fsdevel, Linux Kernel Mailing List
Subject: Re: [PATCH] filesystem-dax: Disable PMD support
Message-ID: <20190628163721.GC4286@bombadil.infradead.org>

On Thu, Jun 27, 2019 at 07:39:37PM -0700, Dan Williams wrote:
> On Thu, Jun 27, 2019 at 12:59 PM Matthew Wilcox wrote:
> > On Thu, Jun 27, 2019 at 12:09:29PM -0700, Dan Williams wrote:
> > > > This bug feels like we failed to unlock, or unlocked the wrong entry
> > > > and this hunk in the bisected commit looks suspect to me. Why do we
> > > > still need to drop the lock now that the radix_tree_preload() calls
> > > > are gone?
> > >
> > > Nevermind, unmap_mapping_pages() takes a sleeping lock, but then I
> > > wonder why we don't restart the lookup like the old implementation.
> >
> > If something can remove a locked entry, then that would seem like the
> > real bug. Might be worth inserting a lookup there to make sure that it
> > hasn't happened, I suppose?
>
> Nope, added a check, we do in fact get the same locked entry back
> after dropping the lock.

Okay, good, glad to have ruled that out.

> The deadlock revolves around the mmap_sem. One thread holds it for
> read and then gets stuck indefinitely in get_unlocked_entry(). Once
> that happens another rocksdb thread tries to mmap and gets stuck
> trying to take the mmap_sem for write. Then all new readers, including
> ps and top that try to access a remote vma, then get queued behind
> that write.
>
> It could also be the case that we're missing a wake up.

That was the conclusion I came to; that one thread holding the mmap sem
for read isn't being woken up when it should be.  Just need to find it ...
obviously it's something to do with the PMD entries.
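
For anyone following along, the pile-up described above can be sketched
with a toy model. This is only an illustration under stated assumptions:
FairRWSem and simulate() are hypothetical names, the model assumes strict
FIFO hand-off, and it is not how the kernel's rwsem is implemented. It
only shows why one reader that never wakes up is enough to stall a later
writer and, behind that writer, every new reader.

```python
from collections import deque


class FairRWSem:
    """Toy reader/writer semaphore with strict FIFO hand-off (hypothetical).

    A queued writer blocks every waiter that arrives after it, including
    readers, even though the lock is currently held only for read.
    """

    def __init__(self):
        self.readers = set()    # names holding the lock for read
        self.writer = None      # name holding the lock for write
        self.waiters = deque()  # (name, mode) in arrival order

    def acquire(self, name, mode):
        """Queue a request; return True if granted now, False if it waits."""
        self.waiters.append((name, mode))
        self._grant()
        return name in self.readers or self.writer == name

    def release(self, name):
        if self.writer == name:
            self.writer = None
        else:
            self.readers.discard(name)
        self._grant()

    def _grant(self):
        # Serve waiters strictly from the head of the queue.
        while self.waiters:
            name, mode = self.waiters[0]
            if mode == "read" and self.writer is None:
                self.readers.add(name)
            elif mode == "write" and self.writer is None and not self.readers:
                self.writer = name
            else:
                break  # head cannot proceed, so nobody behind it can either
            self.waiters.popleft()

    def blocked(self):
        return [name for name, _ in self.waiters]


def simulate():
    sem = FairRWSem()
    # This thread takes the lock for read, then sleeps waiting for a DAX
    # entry; in the model it simply never calls release() until "woken".
    sem.acquire("faulting_thread", "read")
    sem.acquire("rocksdb_mmap", "write")  # queued behind the reader
    sem.acquire("ps", "read")             # queued behind the writer
    sem.acquire("top", "read")            # queued behind the writer
    while_stuck = sem.blocked()

    sem.release("faulting_thread")        # the wake-up finally arrives
    after_wake = sem.blocked()

    sem.release("rocksdb_mmap")           # writer finishes, readers drain
    after_writer = sem.blocked()
    return while_stuck, after_wake, after_writer


print(simulate())
# (['rocksdb_mmap', 'ps', 'top'], ['ps', 'top'], [])
```

As long as the first reader stays asleep, the first list never shrinks,
which matches the symptom of ps and top hanging. The model says nothing
about why the wake-up is missed; that part still needs to be found.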