Date: Wed, 10 Mar 2021 20:28:07 -0400
From: Jason Gunthorpe
To: Sean Christopherson
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    David Rientjes, Ben Gardon, Michal Hocko, Jérôme Glisse,
    Andrea Arcangeli, Johannes Weiner, Dimitri Sivanich
Subject: Re: [PATCH] mm/oom_kill: Ensure MMU notifier range_end() is paired with range_start()
Message-ID: <20210311002807.GQ444867@ziepe.ca>
References: <20210310213117.1444147-1-seanjc@google.com>
In-Reply-To: <20210310213117.1444147-1-seanjc@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 10, 2021 at 01:31:17PM -0800, Sean Christopherson wrote:
> Invoke the MMU notifier's .invalidate_range_end() callbacks even if one
> of the .invalidate_range_start() callbacks failed.
> If there are multiple
> notifiers, the notifier that did not fail may have performed actions in
> its ...start() that it expects to unwind via ...end().  Per the
> mmu_notifier_ops documentation, ...start() and ...end() must be paired.

No, this is not OK: if invalidate_start returns EBUSY, invalidate_end
should *not* be called.

As you observed:

> The only in-kernel usage that is fatally broken is the SGI UV GRU driver,
> which effectively blocks and sleeps fault handlers during ...start(), and
> unblocks/wakes the handlers during ...end().  But, the only users that
> can fail ...start() are the i915 and Nouveau drivers, which are unlikely
> to collide with the SGI driver.

It used to be worse, but I've since moved most of the other problematic
users to the itree notifier, which doesn't have the problem.

> KVM is the only other user of ...end(), and while KVM also blocks fault
> handlers in ...start(), the fault handlers do not sleep and originate in

KVM will have its mmu_notifier_count become imbalanced:

static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
					const struct mmu_notifier_range *range)
{
	kvm->mmu_notifier_count++;

static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
					const struct mmu_notifier_range *range)
{
	kvm->mmu_notifier_count--;

Which I believe is fatal to kvm? These notifiers certainly do not only
happen at process exit.

So, both of the remaining _end users become corrupted with this patch!

I've tried to fix this before; the only thing that seems like it will
work is to sort the hlist and call end only for the notifiers whose
start succeeded, by comparing pointers with <.

This is because the hlist can have items removed concurrently under
SRCU, so there is no easy way to compute the subset that succeeded in
calling start.

I had a prior effort to just ban more than one hlist notifier with end,
but it turns out kvm on ARM uses two all the time (IIRC).

> Found by inspection.
> Verified by adding a second notifier in KVM
> that

AFAIK it is a non-problem in real life because kvm is not mixed with
notifier_start's that fail (and GRU is dead?). Everything else was
fixed by moving to itree.

Jason
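
The pointer-ordering idea mentioned above can be illustrated with a minimal
userspace sketch. This is not kernel code, and it rests on several assumptions:
struct notifier, notifier_start(), notifier_end(), invalidate_start_all() and
invalidate_end_all() are hypothetical names invented for illustration, the
list is a plain array sorted on every call rather than an hlist kept sorted,
and SRCU and concurrent removal are ignored entirely. It only shows why
address order makes "which starts succeeded" decidable with a single
comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a notifier; NOT the kernel's struct mmu_notifier. */
struct notifier {
	int fail_start;	/* nonzero: start() fails for this notifier */
	int count;	/* models a counter such as kvm->mmu_notifier_count */
};

static int notifier_start(struct notifier *n)
{
	if (n->fail_start)
		return -1;	/* models a failing start, e.g. a busy range */
	n->count++;
	return 0;
}

static void notifier_end(struct notifier *n)
{
	/* An end() without a matching start() would unbalance the counter. */
	assert(n->count > 0);
	n->count--;
}

/* Compare two list slots by the address of the notifier they point to. */
static int cmp_addr(const void *a, const void *b)
{
	uintptr_t x = (uintptr_t)*(struct notifier *const *)a;
	uintptr_t y = (uintptr_t)*(struct notifier *const *)b;

	return (x > y) - (x < y);
}

/*
 * Call start() in address order. If one fails, call end() only for the
 * notifiers whose address is below the failing one: in a sorted walk those
 * are exactly the notifiers whose start() already succeeded.
 * Returns 0 on success, -1 if some start() failed.
 */
static int invalidate_start_all(struct notifier **list, size_t len)
{
	qsort(list, len, sizeof(*list), cmp_addr);

	for (size_t i = 0; i < len; i++) {
		if (notifier_start(list[i]) == 0)
			continue;

		struct notifier *failed = list[i];

		for (size_t j = 0; j < len; j++) {
			if ((uintptr_t)list[j] < (uintptr_t)failed)
				notifier_end(list[j]);
		}
		return -1;
	}
	return 0;
}

/* Only valid after invalidate_start_all() returned 0. */
static void invalidate_end_all(struct notifier **list, size_t len)
{
	for (size_t i = 0; i < len; i++)
		notifier_end(list[i]);
}
```

In this model, a notifier that sorts after the failing one never sees end(),
so its counter stays balanced; calling end() on every notifier after a failed
start would instead trip the assert in notifier_end().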