Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp828051ybz; Wed, 15 Apr 2020 20:08:02 -0700 (PDT) X-Google-Smtp-Source: APiQypLXgyVL1Pz6ngL1scbitptEp7rqFN6Ghf6kVfEvX0uRXmZDn7o88nyAP3/TqJkqSTZW3HX3 X-Received: by 2002:a17:907:210d:: with SMTP id qn13mr7955713ejb.376.1587006482171; Wed, 15 Apr 2020 20:08:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1587006482; cv=none; d=google.com; s=arc-20160816; b=YzFXRQuoDxP7dpp6Z39o7Y5JVlmDj75o2aEjYdt+LUg1QDBUjdmb9m0qkjR3yt/OuD 081vzRhpytKg2KFK2bQMOZDDAescEIe42nQWkZVLI8fpiHVPAxZ192LMFW8Buuoifrtx uzSp8ruDw26I54me8n5Wura/hDy+rX4uoT31Xph5uDLERwfeTysxRhYXN/NCS9OPSoUX 5Ibox1ZjNP7zaRlukwVhQniahhr8XqE9iTEGDF9Lm2qTroLI+ydPhGyRyBglLDTp64NA c3anU7AfLbcH3a0IIXQoQ13pnvier+GNFLT0l6+I/Iu3uw7ZjFmMmxNWL3k0gNdHiXzF AQ7w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :message-id:in-reply-to:subject:cc:to:from:date:dkim-signature; bh=Ny/fDBkxdwTaLeh6f3A963P77GPd8EJiQrLAHfr4rm0=; b=gcJvsNrYII+iHkPpcHUq5EcUND0h1qhKJqxh2TTkemXl10wm6DkV+Kl6dF4wPUjsn+ YUOQdwD0qjHIQgMFsc/VKLiEa4t0e/kN835+3crwRwCPFpJPllwlJxvXYlmD9UnfbxQk Ck7rOOUJpLN3OsymDCQVycmo5K4WjgJ7+UAPh1DM2yF2xbPpZoGiF4pIbJOvNT+tCNtn YDz27hgUCHZrdi3cLVNc3LZ17vPk2pHwUOyXGjFMAa89Gw8/fT4wEgFE0AvHO1/8sqiB WGOeijRDDNPXAQ1lDa3ufFdozOytApWpxm7hV1b6qTzSb03GnseRiH5Touvne7qIY/7p rFOg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="Jej/BoUU"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id n9si9547401edb.87.2020.04.15.20.07.38; Wed, 15 Apr 2020 20:08:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="Jej/BoUU"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389135AbgDPDFk (ORCPT + 99 others); Wed, 15 Apr 2020 23:05:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43466 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S2388679AbgDPDFi (ORCPT ); Wed, 15 Apr 2020 23:05:38 -0400 Received: from mail-ot1-x343.google.com (mail-ot1-x343.google.com [IPv6:2607:f8b0:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06EC2C061A0C for ; Wed, 15 Apr 2020 20:05:37 -0700 (PDT) Received: by mail-ot1-x343.google.com with SMTP id i22so1857023otp.12 for ; Wed, 15 Apr 2020 20:05:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version; bh=Ny/fDBkxdwTaLeh6f3A963P77GPd8EJiQrLAHfr4rm0=; b=Jej/BoUUORsFXG/s5u/iKEhJ07/whIHtGVhl6gY2G329P86S3wsYUiobQuEyufBtuS 4B7Vdk+bf/sob5IMhTGnR86uEQtd2I2qQlG7AhF97tSQIjOBuvJUFRxz/JFHnV+C6ama auWCfmMYFKc18E6KyxO8DAebV8v8y5ljVjvntmZ8/WgYAaHcJaxp2HSI7ep5WQnO9KmG GhGWow/SvKNJrZtnV9cqfAj4wJp4P5/7gBK/mudkZ725HMvosfNmlUuFPHReIqPAN8cz HnIp+CERVmmG9l4uKs2mXJds6dqbuTxxJ4aWPtPeaPSb5r+1pvC8CeRaJr1WFMshJ6EJ 4D/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:in-reply-to:message-id :references:user-agent:mime-version; bh=Ny/fDBkxdwTaLeh6f3A963P77GPd8EJiQrLAHfr4rm0=; b=gGjwFgE5VauxyXvVPzjLiyQFY5SjOqcMK8l8YRIYO66oN2bc+G5Sa3U7TVS3JFQAmK 9TWUkjlsjUrKP4FCgXftmiXI5AHjgVQdrXBM5RCpRTXrlY/4SZ4mAdh7ULHOb6WZixre rN3XTd8JDIQPoEk76/ed6q087AEw5Lf3qnSA9gNGaLgDKX3KjAZPJdMm4Kckq5hfI3jU u2Zd/j+0PDfsVAODUv3cfwbL4i+w2mKkotseOvKaUX8XJt/p3fa5tJkw8Sp4liXNcLjc KLrtlUVGBmMOgwiyLXJYJaHLlGeVKSCz8pwZbY9ZkcQfypbo7ekxdNEF9Tboz3WMmBAh 7fVg== X-Gm-Message-State: AGi0PuYs4jAIpodpwY28COjO2S3taz69p9Yoxphhe0CTCMPBWm6Xroae t8+bLjgBGAjA/hE8lbr6hL0hlQ== X-Received: by 2002:a05:6830:1054:: with SMTP id b20mr9869574otp.19.1587006336936; Wed, 15 Apr 2020 20:05:36 -0700 (PDT) Received: from eggly.attlocal.net (172-10-233-147.lightspeed.sntcca.sbcglobal.net. [172.10.233.147]) by smtp.gmail.com with ESMTPSA id d84sm6867362oig.33.2020.04.15.20.05.35 (version=TLS1 cipher=ECDHE-ECDSA-AES128-SHA bits=128/128); Wed, 15 Apr 2020 20:05:36 -0700 (PDT) Date: Wed, 15 Apr 2020 20:05:13 -0700 (PDT) From: Hugh Dickins X-X-Sender: hugh@eggly.anvils To: Yang Shi cc: Hugh Dickins , syzbot , Andrew Morton , Linux Kernel Mailing List , Linux MM , syzkaller-bugs@googlegroups.com, Linus Torvalds Subject: Re: possible deadlock in shmem_uncharge In-Reply-To: Message-ID: References: <000000000000e5838c05a3152f53@google.com> User-Agent: Alpine 2.11 (LSU 23 2013-08-11) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 15 Apr 2020, Yang Shi wrote: > On Wed, Apr 15, 2020 at 7:04 PM Hugh Dickins wrote: > > On Mon, 13 Apr 2020, Yang Shi wrote: > > > > > > It looks shmem_uncharge() is just called by __split_huge_page() and > > > collapse_file(). The collapse_file() has acquired xa_lock with irq > > > disabled before acquiring info->lock, so it is safe. > > > __split_huge_page() is called with holding xa_lock with irq enabled, > > > but lru_lock is acquired with irq disabled before acquiring xa_lock. > > > > > > So, it is unnecessary to acquire info->lock with irq disabled in > > > shmem_uncharge(). Can syzbot try the below patch? > > > > But I disagree with the patch below. You're right that IRQ-disabling > > here is unnecessary, given its two callers; but I'm not sure that we > > want it to look different from shmem_charge() and all other info->lock > > takers; and, more importantly, I don't see how removing the redundant > > IRQ-saving below could make it any less liable to deadlock. > > Yes, I realized the patch can't suppress the lockdep splat. But, > actually I didn't understand how this deadlock could happen because > info_lock is acquired with IRQ disabled before acquiring > user_shm_lock. So, interrupt can't come in at all if I didn't miss > anything. I think the story it's trying to tell is this (but, like most of us, I do find Mr Lockdep embarrassingly difficult to understand; and I'm not much good at drawing race diagrams either): CPU0 was in user_shm_unlock(), it's got shmlock_user_lock, then an interrupt comes in. It's an endio kind of interrupt, which goes off to test_clear_page_writeback(), which wants the xa_lock on i_pages. Meanwhile, CPU1 was doing some SysV SHM locking, it's got as far as shmem_lock(), it has acquired info->lock, and goes off to user_shm_lock() which wants shmlock_user_lock. But sadly, CPU2 is splitting a shmem THP, calling shmem_uncharge() that wants info->lock while outer level holds xa_lock on i_pages: with interrupts properly disabled, but that doesn't help. Now, that story doesn't quite hold up as a deadlock, because shmem doesn't use writeback tags; and (unless you set shmem_enabled "force") I don't think there's a way to get shmem THPs in SysV SHM (and are they hole-punchable? maybe through MADV_REMOVE); so it looks like we're talking about different inodes. But lockdep is right to report it, and more thought might arrive at a more convincing scenario. Anyway, easily fixed and best fixed. (But now I think my patch must wait until tomorrow.) Hugh