Date: Fri, 20 Nov 2020 21:17:07 +0000
From: Matthew Wilcox
To: David Hildenbrand
Cc:
    Pavel Tatashin, linux-mm, Andrew Morton, Vlastimil Babka, LKML,
    Michal Hocko, Oscar Salvador, Dan Williams, Sasha Levin, Tyler Hicks,
    Joonsoo Kim, sthemmin@microsoft.com
Subject: Re: Pinning ZONE_MOVABLE pages
Message-ID: <20201120211707.GC4327@casper.infradead.org>
References: <9452B231-23F3-48F5-A0E2-D6C5603F87F1@redhat.com>
In-Reply-To: <9452B231-23F3-48F5-A0E2-D6C5603F87F1@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 20, 2020 at 09:59:24PM +0100, David Hildenbrand wrote:
> > On 20.11.2020 at 21:28, Pavel Tatashin wrote:
> >
> > Recently, I encountered a hang that happens during a memory
> > hot-remove operation. It turns out that the hang is caused by pinned
> > user pages in ZONE_MOVABLE.
> >
> > The kernel expects that all pages in ZONE_MOVABLE can be migrated, but
> > this is not the case if a user application, for example through the
> > dpdk libraries, has pinned them via vfio dma map. The kernel keeps
> > trying to hot-remove them, but the refcnt never gets to zero, so we
> > loop until the hardware watchdog kicks in.
> >
> > We cannot do dma unmaps before hot-remove, because hot-remove is a
> > slow operation, and we have thousands of network flows handled by
> > dpdk that we just cannot suspend for the duration of the hot-remove
> > operation.
>
> Hi!
>
> It's a known problem also for VMs using vfio. I thought about this a
> while ago and came to the same conclusion: before performing long-term
> pinnings, we have to migrate pages off the movable zone. After that,
> it's too late.

We can't, though. VMs using vfio pin their entire address space (right?),
so we end up with basically all of the !MOVABLE memory used for VMs while
the MOVABLE memory goes unused (I'm thinking about the case of a machine
which only hosts VMs and has nothing else to do with its memory).
In that case, the sysadmin is going to reconfigure ZONE_MOVABLE away, and
now we just don't have any ZONE_MOVABLE. So what's the point?

ZONE_MOVABLE can also be pinned by mlock() and other such system calls.
The kernel needs to understand that ZONE_MOVABLE memory may not actually
be movable, and skip the unmovable stuff.
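
To make the difference between the two behaviours concrete, here is a toy
userspace model in Python. It is only a sketch and not kernel code: Page,
offline_retry_forever(), offline_skip_pinned() and make_range() are
hypothetical names invented for this illustration, and max_passes merely
stands in for the hardware watchdog mentioned in the report.

```python
from dataclasses import dataclass


@dataclass
class Page:
    pfn: int
    pinned: bool = False    # long-term pin, e.g. from a vfio DMA mapping
    migrated: bool = False


def offline_retry_forever(pages, max_passes):
    """Model of the reported behaviour: retry until every page has migrated.

    Returns the number of passes used, or None if the pass budget (our
    stand-in for the watchdog) runs out first.
    """
    for n in range(1, max_passes + 1):
        for page in pages:
            if not page.pinned:
                page.migrated = True
        if all(p.migrated for p in pages):
            return n
    return None


def offline_skip_pinned(pages):
    """Model of the suggestion: one pass, pinned pages are skipped and reported."""
    skipped = []
    for page in pages:
        if page.pinned:
            skipped.append(page.pfn)
        else:
            page.migrated = True
    return skipped


def make_range():
    """Four pages, with pfn 2 pinned."""
    return [Page(pfn, pinned=(pfn == 2)) for pfn in range(4)]


if __name__ == "__main__":
    print(offline_retry_forever(make_range(), max_passes=1000))
    print(offline_skip_pinned(make_range()))
```

In this model a single pinned page is enough to make the retry loop exhaust
its budget, while the skipping variant finishes in one pass and hands back
the list of pages it could not move. What the caller should then do with
that list is exactly the open question in this thread.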