Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp1102575ybb; Wed, 25 Mar 2020 15:57:23 -0700 (PDT) X-Google-Smtp-Source: ADFU+vtY9EdjwGh1JTSwZ0CUWqCdqJbwofalCamp31GDDVHM7HYdrzXubLZ3uXkKCnGSevVoIBaR X-Received: by 2002:a4a:b30b:: with SMTP id m11mr3190471ooo.54.1585177042901; Wed, 25 Mar 2020 15:57:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585177042; cv=none; d=google.com; s=arc-20160816; b=Uq2ty0qfSS9bb/Yxobp2N2O1Kz9bwYw+Q2URg+Kun9b9cISj5EhWnlSb0GCHy5HO4i U6nlVlV3tzoH5MX0Z0YdVAz9SaFGhKRm3d1At9t/MV5ng92LAkcuBlEz9ev8as/Lthsk HaHh2wfTofBRS6uejXRbJ2BDIctCTg4svZGcfeJZKngcd6xL8viA7uPKy+zUItJYWhOy l7iCGaNIpKjUA/NV3GNoXmB1A2GcYRvaDEB3mb2fMhseAeGqGrAWob23mDfguIIPKVF0 hk1ZwXiT6HkZhCiFOqXVNY/Sb6iFYtbMXC92f98Zodo+o2ifd7GhLSORz0JrAdkUrvaE UvSw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ohAWFBJQsgkoA1nFpeh6U+WJa1ZEf4/+VDN1X50qyIM=; b=DHxJRH6tMehgTOXQUL7wkzd4S+sxRgDnWCUElWvUfqtnsMhwj7ZgTFxEPvOLb3OBHF D9bbHuC4CW3iiF8d4mGTNPCFAkboPNSGXRx8g3b3bOKLXyNm6AeEhr3CWzb9fOsU1m5B zzPArOc67wZrIFCCVJFlFCySGMsL87ciijH/VrpCVaV1O+tAglPa5ePQDwha2f5gbMbc PeW6/b6GSdM1SfGZ3s0MvfxO6XfFEN/ICxE5SB4ItJU0E5tMoq2jNKON7mXW7kWiPEk6 pIWGlK2ef/dXFGW5TkzGSlNFfzk+pjm9hrbehD9rXZNmq6uaUACF3EJEqIWj/rfsWOhc zVFQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=oVbQTDqT; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b17si221953otq.269.2020.03.25.15.57.10; Wed, 25 Mar 2020 15:57:22 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=oVbQTDqT; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727685AbgCYW4g (ORCPT + 99 others); Wed, 25 Mar 2020 18:56:36 -0400 Received: from mail-wm1-f68.google.com ([209.85.128.68]:56154 "EHLO mail-wm1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727406AbgCYW4f (ORCPT ); Wed, 25 Mar 2020 18:56:35 -0400 Received: by mail-wm1-f68.google.com with SMTP id z5so4614199wml.5; Wed, 25 Mar 2020 15:56:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ohAWFBJQsgkoA1nFpeh6U+WJa1ZEf4/+VDN1X50qyIM=; b=oVbQTDqTgZY2HGawhmaH1+TiMveSbwgOVwkxKpq/04rig9hVLUrBksr6kHeDkuvQtV EOof3lFD7j67Aid+jw1LLYgCBoeBwl0E17cOnKlwKAEyJ2bFuxffskWwDVTK6BYkw+XH dHJUo4jKmLAg+OUWabf0ONig8kruE6AuhVoB+h7HrarLgdJb/RUVvpi40d4ayJiyGRqT ZKHDFtvq8njmk2kC74Ub2bflo8fLjxwRIaNkoTRTrXzbLnj9OolpUCzVh4/qTPXCM/Qy bZ34g/7Sgr8GWATaaXiYHMKDfmRA1AaMuggQUIUJhb9m7PZO1uOx/0j35eoe9c45aheE CX+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ohAWFBJQsgkoA1nFpeh6U+WJa1ZEf4/+VDN1X50qyIM=; b=qUzbXnfuMegK68DNEr+U6NQpR39VV8zD9Z3IRf7Mw/vkHGgxG7UfdvZmVCzvi2lJ4b WCWZg9AmHRE5Z3e1N5/VGzfbq/nuQ8lpaRod87in9fFAVZ5ZwhU0QR4Y4wT+NxGzw+ga o1OCEilCHZ8+rvhVHJVK8DbGoQqqauvkl5G9Zq18U+kSLNhI6NSrZ39O0OswCXPmV1Hm l6MYkD+RmCJ+DOTZsjDNxPQZSHYf86On6QfyyPGDHo8ynqP244DPYQCQ4ydyCcaL+nOe kZpriOOLmacul7tgP5e934ZpPppEHCtATUVEep/4y6co6oAzDsnwqI8Q/Fd8KM/ue1qc 3bNQ== X-Gm-Message-State: ANhLgQ2RfshJexs0kmFKRFi6r5ZeBD7IhAaAFGeqrAsUYZ/9zdPrKtDV AZ4tQAcadvn7/BKpDEEoUsIg6WSPL4ebMW9k X-Received: by 2002:a1c:721a:: with SMTP id n26mr5788219wmc.25.1585176993094; Wed, 25 Mar 2020 15:56:33 -0700 (PDT) Received: from andrea.corp.microsoft.com ([86.61.236.197]) by smtp.gmail.com with ESMTPSA id q72sm790278wme.31.2020.03.25.15.56.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Mar 2020 15:56:32 -0700 (PDT) From: "Andrea Parri (Microsoft)" To: linux-kernel@vger.kernel.org Cc: "K . Y . Srinivasan" , Haiyang Zhang , Stephen Hemminger , Wei Liu , linux-hyperv@vger.kernel.org, Michael Kelley , Dexuan Cui , Boqun Feng , Vitaly Kuznetsov , "Andrea Parri (Microsoft)" Subject: [RFC PATCH 06/11] Drivers: hv: vmbus: Use a spin lock for synchronizing channel scheduling vs. channel removal Date: Wed, 25 Mar 2020 23:55:00 +0100 Message-Id: <20200325225505.23998-7-parri.andrea@gmail.com> X-Mailer: git-send-email 2.24.0 In-Reply-To: <20200325225505.23998-1-parri.andrea@gmail.com> References: <20200325225505.23998-1-parri.andrea@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Since vmbus_chan_sched() dereferences the ring buffer pointer, we have to make sure that the ring buffer data structures don't get freed while such dereferencing is happening. Current code does this by sending an IPI to the CPU that is allowed to access that ring buffer from interrupt level, cf., vmbus_reset_channel_cb(). But with the new functionality to allow changing the CPU that a channel will interrupt, we can't be sure what CPU will be running the vmbus_chan_sched() function for a particular channel, so the current IPI mechanism is infeasible. Instead synchronize vmbus_chan_sched() and vmbus_reset_channel_cb() by using the (newly introduced) per-channel spin lock "sched_lock". Move the test for onchannel_callback being NULL before the "switch" control statement in vmbus_chan_sched(), in order to not access the ring buffer if the vmbus_reset_channel_cb() has been completed on the channel. Suggested-by: Michael Kelley Signed-off-by: Andrea Parri (Microsoft) --- drivers/hv/channel.c | 24 +++++++----------------- drivers/hv/channel_mgmt.c | 1 + drivers/hv/vmbus_drv.c | 30 +++++++++++++++++------------- include/linux/hyperv.h | 6 ++++++ 4 files changed, 31 insertions(+), 30 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 256ee90c74460..132e476f87b2e 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -594,15 +594,10 @@ int vmbus_teardown_gpadl(struct vmbus_channel *channel, u32 gpadl_handle) } EXPORT_SYMBOL_GPL(vmbus_teardown_gpadl); -static void reset_channel_cb(void *arg) -{ - struct vmbus_channel *channel = arg; - - channel->onchannel_callback = NULL; -} - void vmbus_reset_channel_cb(struct vmbus_channel *channel) { + unsigned long flags; + /* * vmbus_on_event(), running in the per-channel tasklet, can race * with vmbus_close_internal() in the case of SMP guest, e.g., when @@ -618,17 +613,12 @@ void vmbus_reset_channel_cb(struct vmbus_channel *channel) */ tasklet_disable(&channel->callback_event); - channel->sc_creation_callback = NULL; + /* See the inline comments in vmbus_chan_sched(). */ + spin_lock_irqsave(&channel->sched_lock, flags); + channel->onchannel_callback = NULL; + spin_unlock_irqrestore(&channel->sched_lock, flags); - /* Stop the callback asap */ - if (channel->target_cpu != get_cpu()) { - put_cpu(); - smp_call_function_single(channel->target_cpu, reset_channel_cb, - channel, true); - } else { - reset_channel_cb(channel); - put_cpu(); - } + channel->sc_creation_callback = NULL; /* Re-enable tasklet for use on re-open */ tasklet_enable(&channel->callback_event); diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 9b1449c839575..c53f58ba06dcf 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -315,6 +315,7 @@ static struct vmbus_channel *alloc_channel(void) if (!channel) return NULL; + spin_lock_init(&channel->sched_lock); spin_lock_init(&channel->lock); init_completion(&channel->rescind_event); diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 301e3f484bb1a..ebe2716f583d2 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1147,18 +1147,6 @@ static void vmbus_force_channel_rescinded(struct vmbus_channel *channel) } #endif /* CONFIG_PM_SLEEP */ -/* - * Direct callback for channels using other deferred processing - */ -static void vmbus_channel_isr(struct vmbus_channel *channel) -{ - void (*callback_fn)(void *); - - callback_fn = READ_ONCE(channel->onchannel_callback); - if (likely(callback_fn != NULL)) - (*callback_fn)(channel->channel_callback_context); -} - /* * Schedule all channels with events pending */ @@ -1189,6 +1177,7 @@ static void vmbus_chan_sched(struct hv_per_cpu_context *hv_cpu) return; for_each_set_bit(relid, recv_int_page, maxbits) { + void (*callback_fn)(void *context); struct vmbus_channel *channel; if (!sync_test_and_clear_bit(relid, recv_int_page)) @@ -1214,13 +1203,26 @@ static void vmbus_chan_sched(struct hv_per_cpu_context *hv_cpu) if (channel->rescind) goto sched_unlock_rcu; + /* + * Make sure that the ring buffer data structure doesn't get + * freed while we dereference the ring buffer pointer. Test + * for the channel's onchannel_callback being NULL within a + * sched_lock critical section. See also the inline comments + * in vmbus_reset_channel_cb(). + */ + spin_lock(&channel->sched_lock); + + callback_fn = channel->onchannel_callback; + if (unlikely(callback_fn == NULL)) + goto sched_unlock; + trace_vmbus_chan_sched(channel); ++channel->interrupts; switch (channel->callback_mode) { case HV_CALL_ISR: - vmbus_channel_isr(channel); + (*callback_fn)(channel->channel_callback_context); break; case HV_CALL_BATCHED: @@ -1230,6 +1232,8 @@ static void vmbus_chan_sched(struct hv_per_cpu_context *hv_cpu) tasklet_schedule(&channel->callback_event); } +sched_unlock: + spin_unlock(&channel->sched_lock); sched_unlock_rcu: rcu_read_unlock(); } diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 6c794fd5c903e..ce32ab186192f 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -771,6 +771,12 @@ struct vmbus_channel { void (*onchannel_callback)(void *context); void *channel_callback_context; + /* + * Synchronize channel scheduling and channel removal; see the inline + * comments in vmbus_chan_sched() and vmbus_reset_channel_cb(). + */ + spinlock_t sched_lock; + /* * A channel can be marked for one of three modes of reading: * BATCHED - callback called from taslket and should read -- 2.24.0