To: Sean Christopherson, Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras
Cc: James Morse, Julien Thierry, Suzuki K Poulose, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon
References: <20210402005658.3024832-1-seanjc@google.com> <20210402005658.3024832-10-seanjc@google.com>
From: Paolo Bonzini
Subject: Re: [PATCH v2 09/10] KVM: Don't take mmu_lock for range invalidation unless necessary
Message-ID: <417bd6b5-b7d0-ed22-adae-02150cdbfebe@redhat.com>
Date: Fri, 2 Apr 2021 11:34:38 +0200
In-Reply-To: <20210402005658.3024832-10-seanjc@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/04/21 02:56, Sean Christopherson wrote:
> Avoid taking mmu_lock for unrelated .invalidate_range_{start,end}()
> notifications.  Because mmu_notifier_count must be modified while holding
> mmu_lock for write, and must always be paired across start->end to stay
> balanced, lock elision must happen in both or none.  To meet that
> requirement, add a rwsem to prevent memslot updates across range_start()
> and range_end().
>
> Use a rwsem instead of a rwlock since most notifiers _allow_ blocking,
> and the lock will be held across the entire start() ... end() sequence.
> If anything in the sequence sleeps, including the caller or a different
> notifier, holding the spinlock would be disastrous.
>
> For notifiers that _disallow_ blocking, e.g. OOM reaping, simply go down
> the slow path of unconditionally acquiring mmu_lock.  The sane
> alternative would be to try to acquire the lock and force the notifier
> to retry on failure.  But since OOM is currently the _only_ scenario
> where blocking is disallowed attempting to optimize a guest that has been
> marked for death is pointless.
>
> Unconditionally define and use mmu_notifier_slots_lock in the memslots
> code, purely to avoid more #ifdefs.  The overhead of acquiring the lock
> is negligible when the lock is uncontested, which will always be the case
> when the MMU notifiers are not used.
>
> Note, technically flag-only memslot updates could be allowed in parallel,
> but stalling a memslot update for a relatively short amount of time is
> not a scalability issue, and this is all more than complex enough.

Proposal for the locking documentation:

diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index b21a34c34a21..3e4ad7de36cb 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -16,6 +16,13 @@ The acquisition orders for mutexes are as follows:
 - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
   them together is quite rare.
 
+- The kvm->mmu_notifier_slots_lock rwsem ensures that pairs of
+  invalidate_range_start() and invalidate_range_end() callbacks
+  use the same memslots array.  kvm->slots_lock is taken outside the
+  write-side critical section of kvm->mmu_notifier_slots_lock, so
+  MMU notifiers must not take kvm->slots_lock.  No other write-side
+  critical sections should be added.
+
 On x86, vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock.
 
 Everything else is a leaf: no other lock is taken inside the critical

Paolo