From: "Huang, Ying"
To: Alistair Popple
Cc: Yang Shi, John Hubbard, Andrew Morton, Zi Yan, Baolin Wang,
    Oscar Salvador, "Matthew Wilcox"
Subject: Re: [RFC 2/6] mm/migrate_pages: split unmap_and_move() to _unmap() and _move()
In-Reply-To: <87pmfgjnpj.fsf@nvdebian.thelocal> (Alistair Popple's message of
    "Wed, 28 Sep 2022 10:59:08 +1000")
References: <20220921060616.73086-1-ying.huang@intel.com>
    <20220921060616.73086-3-ying.huang@intel.com>
    <87o7v2lbn4.fsf@nvdebian.thelocal>
    <87fsgdllmb.fsf@nvdebian.thelocal>
    <87ill937qe.fsf@yhuang6-desk2.ccr.corp.intel.com>
    <46807002-c42c-1232-0938-5b48050171ee@nvidia.com>
    <87pmfgjnpj.fsf@nvdebian.thelocal>
Date: Wed, 28 Sep 2022 09:41:28 +0800
Message-ID: <87czbg2s3b.fsf@yhuang6-desk2.ccr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Alistair Popple writes:

> Yang Shi writes:
>
>> On Tue, Sep 27, 2022 at 1:35 PM John Hubbard wrote:
>>>
>>> On 9/26/22 18:51, Huang, Ying wrote:
>>> >>> But there might be other cases which may incur deadlock, for example,
>>> >>> filesystem writeback IIUC.
>>> >>> Some filesystems may lock a bunch of pages
>>> >>> then write them back in a batch. The same pages may be on the
>>> >>> migration list and they are also dirty and seen by writeback. I'm not
>>> >>> sure whether I miss something that could prevent such a deadlock from
>>> >>> happening.
>>> >>
>>> >> I'm not overly familiar with that area but I would assume any filesystem
>>> >> code doing this would already have to deal with deadlock potential.
>>> >
>>> > Thank you very much for pointing this out. I think the deadlock is a
>>> > real issue. Anyway, we shouldn't forbid other places in kernel to lock
>>> > 2 pages at the same time.
>>> >
>>>
>>> I also agree that we cannot make any rules such as "do not lock > 1 page
>>> at the same time, elsewhere in the kernel", because it is already
>>> happening, for example in page-writeback.c, which locks PAGEVEC_SIZE
>>> (15) pages per batch [1].
>
> That's not really the case though. The inner loop of write_cache_page()
> only ever locks one page at a time, either directly via the
> unlock_page() on L2338 (those goto's are amazing) or indirectly via
> (*writepage)() on L2359.
>
> So there's no deadlock potential there because unlocking any previously
> locked page(s) doesn't depend on obtaining the lock for another page.
> Unless I've missed something?

Yes. This is my understanding too after checking ext4_writepage().

Best Regards,
Huang, Ying

>>> The only deadlock prevention convention that I see is the convention of
>>> locking the pages in order of ascending address. That only helps if
>>> everything does it that way, and migrate code definitely does not.
>>> However...I thought that up until now, at least, the migrate code relied
>>> on trylock (which can fail, and so migration can fail, too), to avoid
>>> deadlock. Is that changing somehow, I didn't see it?
>>
>> The trylock is used by async mode which does try to avoid blocking.
>> But sync mode does use lock.
>> The current implementation of migration
>> does migrate one page at a time, so it is not a problem.
>>
>>>
>>>
>>> [1] https://elixir.bootlin.com/linux/latest/source/mm/page-writeback.c#L2296
>>>
>>> thanks,
>>>
>>> --
>>> John Hubbard
>>> NVIDIA
>>>
>>> > The simplest solution is to batch page migration only if mode ==
>>> > MIGRATE_ASYNC. Then we may consider to fall back to non-batch mode if
>>> > mode != MIGRATE_ASYNC and trylock page fails.
>>> >
>>>
>>>
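
For readers following the thread, the fallback idea quoted last (batch with
trylock, drop to non-batch mode when a trylock fails) can be modelled in user
space. This is only an illustrative sketch, not kernel code and not the
patch series' implementation: Page, Mode, migrate_one() and migrate_batch()
are hypothetical names, threading.Lock stands in for the page lock, and
setting a flag stands in for the real unmap and move steps. The property it
tries to show is that a blocking lock is only ever taken while no other page
lock is held.

```python
import threading
from enum import Enum, auto


class Mode(Enum):
    ASYNC = auto()  # stands in for MIGRATE_ASYNC (must not block)
    SYNC = auto()   # stands in for the blocking migration modes


class Page:
    """Toy stand-in for a page: an id, a lock, and a 'migrated' flag."""

    def __init__(self, pfn):
        self.pfn = pfn
        self.lock = threading.Lock()
        self.migrated = False


def migrate_one(page, mode):
    """Non-batch path: at most one page lock is held at any time."""
    if mode is Mode.ASYNC:
        if not page.lock.acquire(blocking=False):
            return False  # async mode gives up instead of waiting
    else:
        page.lock.acquire()  # blocking is safe: no other page lock is held
    try:
        page.migrated = True  # placeholder for unmap + copy + remap
        return True
    finally:
        page.lock.release()


def migrate_batch(pages, mode):
    """Batch path: trylock only, so we never wait while holding a lock.

    Returns the list of pages that could not be migrated.
    """
    locked, deferred = [], []
    for page in pages:
        if page.lock.acquire(blocking=False):
            locked.append(page)
        else:
            deferred.append(page)  # handled later, outside the batch

    for page in locked:
        page.migrated = True  # placeholder for the batched unmap/move
    for page in locked:
        page.lock.release()

    # Every batch lock has been released, so falling back to the
    # one-page-at-a-time path cannot create a lock-ordering cycle.
    return [page for page in deferred if not migrate_one(page, mode)]
```

With one page already locked by someone else, an ASYNC batch migrates the
other pages and reports the contended one as failed; a SYNC caller would
instead wait for that page, but only after all batch locks are released.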