From: Dan Williams
Date: Thu, 27 Jun 2019 19:39:37 -0700
Subject: Re: [PATCH] filesystem-dax: Disable PMD support
To: Matthew Wilcox
Cc: linux-nvdimm, Jan Kara, stable, Robert Barror, Seema Pandit, linux-fsdevel, Linux Kernel Mailing List
In-Reply-To: <20190627195948.GB4286@bombadil.infradead.org>
References: <156159454541.2964018.7466991316059381921.stgit@dwillia2-desk3.amr.corp.intel.com> <20190627123415.GA4286@bombadil.infradead.org> <20190627195948.GB4286@bombadil.infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 27, 2019 at 12:59 PM Matthew Wilcox wrote:
>
> On Thu, Jun 27, 2019 at 12:09:29PM -0700, Dan Williams wrote:
> > > This bug feels like we failed to unlock, or unlocked the wrong entry,
> > > and this hunk in the bisected commit looks suspect to me. Why do we
> > > still need to drop the lock now that the radix_tree_preload() calls
> > > are gone?
> >
> > Nevermind, unmap_mapping_pages() takes a sleeping lock, but then I
> > wonder why we don't restart the lookup like the old implementation.
>
> We have the entry locked:
>
>         /*
>          * Make sure 'entry' remains valid while we drop
>          * the i_pages lock.
>          */
>         dax_lock_entry(xas, entry);
>
>         /*
>          * Besides huge zero pages the only other thing that gets
>          * downgraded are empty entries which don't need to be
>          * unmapped.
>          */
>         if (dax_is_zero_entry(entry)) {
>                 xas_unlock_irq(xas);
>                 unmap_mapping_pages(mapping,
>                                 xas->xa_index & ~PG_PMD_COLOUR,
>                                 PG_PMD_NR, false);
>                 xas_reset(xas);
>                 xas_lock_irq(xas);
>         }
>
> If something can remove a locked entry, then that would seem like the
> real bug.  Might be worth inserting a lookup there to make sure that it
> hasn't happened, I suppose?

Nope, added a check; we do in fact get the same locked entry back after
dropping the lock.

The deadlock revolves around the mmap_sem. One thread holds it for read
and then gets stuck indefinitely in get_unlocked_entry(). Once that
happens, another rocksdb thread tries to mmap and gets stuck trying to
take the mmap_sem for write. All new readers, including ps and top
trying to access a remote vma, then get queued behind that writer. It
could also be the case that we're missing a wake-up.
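For what it's worth, the pile-up falls out of the fairness of a queued
rw-semaphore: once a writer is waiting, new readers stop being granted
the lock even while only readers hold it. A toy sketch of that ordering
(plain Python, purely illustrative; FairRWSem and the thread roles are
made-up names, not anything from the kernel's rwsem implementation):

```python
class FairRWSem:
    """Toy model of a fair reader/writer semaphore: a queued
    writer blocks any *new* readers from taking the lock."""

    def __init__(self):
        self.readers = 0          # threads currently holding it for read
        self.writer = False       # a thread currently holds it for write
        self.writers_waiting = 0  # writers queued behind active readers

    def try_read(self):
        # Fairness: new readers must wait behind any queued writer.
        if self.writer or self.writers_waiting:
            return False
        self.readers += 1
        return True

    def try_write(self):
        # A writer needs exclusive access; otherwise it queues.
        if self.writer or self.readers:
            self.writers_waiting += 1
            return False
        self.writer = True
        return True


sem = FairRWSem()

# rocksdb fault thread: takes mmap_sem for read, then (per the bug)
# blocks forever in get_unlocked_entry() without ever releasing it.
assert sem.try_read()

# second rocksdb thread: mmap() wants mmap_sem for write -> queued.
assert not sem.try_write()

# ps/top touching the remote vma: new readers now queue behind the
# waiting writer, even though only a reader holds the lock.
assert not sem.try_read()
```

Since the first reader never wakes up, nothing in this chain can ever
make progress, which matches the observed hang.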