Date: Wed, 10 Mar 2021 21:50:13 -0400
From: Jason Gunthorpe
To: Sean Christopherson
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    David Rientjes, Ben Gardon, Michal Hocko, Jérôme Glisse,
    Andrea Arcangeli, Johannes Weiner, Dimitri Sivanich
Subject: Re: [PATCH] mm/oom_kill: Ensure MMU notifier range_end() is paired with range_start()
Message-ID: <20210311015013.GS444867@ziepe.ca>
References: <20210310213117.1444147-1-seanjc@google.com> <20210311002807.GQ444867@ziepe.ca>

On Wed, Mar 10, 2021 at 05:20:01PM -0800, Sean Christopherson wrote:

> > Which I believe is fatal to kvm? These notifiers certainly do not only
> > happen at process exit.
>
> My point about the process dying is that the existing bug that causes
> mmu_notifier_count to become imbalanced is benign only because the process is
> being killed, and thus KVM will stop running its vCPUs.

Are you saying we only call non-blocking invalidate during a process
exit event??

> > So, both of the remaining _end users become corrupted with this patch!
>
> I don't follow. mn_hlist_invalidate_range_start() iterates over all
> notifiers, even if a notifier earlier in the chain failed. How will
> KVM become imbalanced?

Er, ok, that got left in a weird way. There is another "bug" where end
is not supposed to be called if the start failed.

> The existing _end users never fail their _start. If KVM started failing its
> start, then yes, it could get corrupted.

Well, maybe that is the way out of this now. If we don't permit a
start to fail if there is an end then we have no problem to unwind it
as we can continue to call everything. This can't be backported too
far though, the itree notifier conversions are what made the WARN_ON
safe today.

Something very approximately like this is closer to my preference:

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 61ee40ed804ee5..6d5cd20f81dadc 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -501,10 +501,25 @@ static int mn_hlist_invalidate_range_start(
 						"");
 				WARN_ON(mmu_notifier_range_blockable(range) ||
 					_ret != -EAGAIN);
+				/*
+				 * We call all the notifiers on any EAGAIN,
+				 * there is no way for a notifier to know if
+				 * its start method failed, thus a start that
+				 * does EAGAIN can't also do end.
+				 */
+				WARN_ON(ops->invalidate_range_end);
 				ret = _ret;
 			}
 		}
 	}
+
+	if (ret) {
+		/* Must be non-blocking to get here*/
+		hlist_for_each_entry_rcu (subscription, &subscriptions->list,
+					  hlist, srcu_read_lock_held(&srcu))
+			subscription->ops->invalidate_range_end(subscription,
+								range);
+	}
 	srcu_read_unlock(&srcu, id);
 
 	return ret;
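
To make the pairing rule concrete, here is a rough userspace model of the unwind described above. It is only a sketch and not kernel code: struct toy_notifier, toy_invalidate_start() and toy_demo() are hypothetical names invented for illustration. The assumptions are that every start callback runs even after one returns -EAGAIN, that a notifier whose start can fail has no end callback, and that the NULL check before calling end is an addition of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a subscriber; not the kernel's struct mmu_notifier. */
struct toy_notifier {
	int (*start)(struct toy_notifier *n, bool blockable);
	void (*end)(struct toy_notifier *n);
	int count;	/* models an "invalidation in progress" counter */
};

/* A start that never fails and is balanced by end_count(). */
static int start_count(struct toy_notifier *n, bool blockable)
{
	(void)blockable;
	n->count++;
	return 0;
}

static void end_count(struct toy_notifier *n)
{
	n->count--;
}

/* A start that refuses to run when it is not allowed to block. */
static int start_eagain(struct toy_notifier *n, bool blockable)
{
	(void)n;
	return blockable ? 0 : -EAGAIN;
}

/*
 * Call every start, remembering a failure but not stopping early.
 * On failure, call every end so that notifiers whose start succeeded
 * are unwound. This is only safe under the assumption that a notifier
 * which can fail its start does not also provide an end.
 */
static int toy_invalidate_start(struct toy_notifier **list, size_t nr,
				bool blockable)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		int _ret = list[i]->start ?
			   list[i]->start(list[i], blockable) : 0;

		if (_ret)
			ret = _ret;	/* keep iterating over the rest */
	}

	if (ret) {
		for (i = 0; i < nr; i++)
			if (list[i]->end)	/* NULL check added by this sketch */
				list[i]->end(list[i]);
	}
	return ret;
}

/* Returns the counter of the well-behaved notifier after a failed start. */
static int toy_demo(void)
{
	struct toy_notifier failer = { start_eagain, NULL, 0 };
	struct toy_notifier counter = { start_count, end_count, 0 };
	struct toy_notifier *list[] = { &failer, &counter };
	int ret = toy_invalidate_start(list, 2, false);

	assert(ret == -EAGAIN);
	return counter.count;
}
```

In this model toy_demo() leaves the counting notifier balanced at zero, because its start ran despite the earlier -EAGAIN and its end was then called during the unwind. If the failing notifier also had an end, it could not tell whether its own start had succeeded, which is the case the WARN_ON in the diff is meant to flag.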