From: Vitaly Kuznetsov
To: Michael Kelley
Cc: stable@vger.kernel.org, kys@microsoft.com, haiyangz@microsoft.com,
 wei.liu@kernel.org, linux-kernel@vger.kernel.org,
 linux-hyperv@vger.kernel.org, decui@microsoft.com
Subject: Re: [PATCH 1/1] Drivers: hv: vmbus: Fix vmbus_wait_for_unload() to scan present CPUs
In-Reply-To: <1684172191-17100-1-git-send-email-mikelley@microsoft.com>
References: <1684172191-17100-1-git-send-email-mikelley@microsoft.com>
Date: Tue, 16 May 2023 11:11:32 +0200
Message-ID: <87pm707i9n.fsf@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Michael Kelley writes:

> vmbus_wait_for_unload() may be called in the panic path after other
> CPUs are stopped. vmbus_wait_for_unload() currently loops through
> online CPUs looking for the UNLOAD response message. But the values of
> CONFIG_KEXEC_CORE and crash_kexec_post_notifiers affect the path used
> to stop the other CPUs, and in one of the paths the stopped CPUs
> are removed from cpu_online_mask. This removal happens in both
> x86/x64 and arm64 architectures. In such a case, vmbus_wait_for_unload()
> only checks the panic'ing CPU, and misses the UNLOAD response message
> except when the panic'ing CPU is CPU 0. vmbus_wait_for_unload()
> eventually times out, but only after waiting 100 seconds.
>
> Fix this by looping through *present* CPUs in vmbus_wait_for_unload().
> The cpu_present_mask is not modified by stopping the other CPUs in the
> panic path, nor should it be. Furthermore, the synic_message_page
> being checked in vmbus_wait_for_unload() is allocated in
> hv_synic_alloc() for all present CPUs. So looping through the
> present CPUs is more consistent.
>
> For additional safety, also add a check for the message_page being
> NULL before looking for the UNLOAD response message.
>
> Reported-by: John Starks
> Fixes: cd95aad55793 ("Drivers: hv: vmbus: handle various crash scenarios")

I see you Cc:ed stable@ on the patch, should we also add

Cc: stable@vger.kernel.org

here explicitly so it gets picked up by various stable backporting
scripts? I guess Wei can do it when picking the patch to the queue...

> Signed-off-by: Michael Kelley
> ---
>  drivers/hv/channel_mgmt.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 007f26d..df2ba20 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -829,11 +829,14 @@ static void vmbus_wait_for_unload(void)
>  		if (completion_done(&vmbus_connection.unload_event))
>  			goto completed;
>
> -		for_each_online_cpu(cpu) {
> +		for_each_present_cpu(cpu) {
>  			struct hv_per_cpu_context *hv_cpu
>  				= per_cpu_ptr(hv_context.cpu_context, cpu);
>
>  			page_addr = hv_cpu->synic_message_page;
> +			if (!page_addr)
> +				continue;
> +

In theory, synic_message_page for all present CPUs is permanently
assigned in hv_synic_alloc() and we fail the whole thing if any of these
allocations fail so page_addr == NULL is likely impossible today but
there's certainly no harm in having this extra check here, this is not a
hotpath.

>  			msg = (struct hv_message *)page_addr
>  				+ VMBUS_MESSAGE_SINT;
>
> @@ -867,11 +870,14 @@ static void vmbus_wait_for_unload(void)
>  	 * maybe-pending messages on all CPUs to be able to receive new
>  	 * messages after we reconnect.
>  	 */
> -	for_each_online_cpu(cpu) {
> +	for_each_present_cpu(cpu) {
>  		struct hv_per_cpu_context *hv_cpu
>  			= per_cpu_ptr(hv_context.cpu_context, cpu);
>
>  		page_addr = hv_cpu->synic_message_page;
> +		if (!page_addr)
> +			continue;
> +
>  		msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT;
>  		msg->header.message_type = HVMSG_NONE;
>  	}

Reviewed-by: Vitaly Kuznetsov

-- 
Vitaly
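
For illustration only (not part of the patch): a minimal userspace sketch
of why scanning online CPUs can miss the UNLOAD response while scanning
present CPUs finds it. Everything here is a hypothetical model, not kernel
code: struct model_cpu, the model_*() helpers, NR_CPUS, and the MSG_*
constants are made up for the example, and it assumes, as the commit
message implies, that the response lands on CPU 0's message page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS    4	/* hypothetical CPU count for this model */
#define MSG_NONE   0	/* stand-in for HVMSG_NONE */
#define MSG_UNLOAD 1	/* stand-in for the UNLOAD response message */

struct model_cpu {
	bool present;		/* models membership in cpu_present_mask */
	bool online;		/* models membership in cpu_online_mask */
	int *message_page;	/* models synic_message_page; may be NULL */
};

/* Models the panic path that drops stopped CPUs from the online mask. */
static void model_stop_other_cpus(struct model_cpu *cpus, int panic_cpu)
{
	assert(panic_cpu >= 0 && panic_cpu < NR_CPUS);

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu != panic_cpu)
			cpus[cpu].online = false;	/* present is untouched */
	}
}

/* Returns the CPU whose page holds the UNLOAD response, or -1 if none. */
static int model_scan(const struct model_cpu *cpus, bool use_present)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		bool in_mask = use_present ? cpus[cpu].present
					   : cpus[cpu].online;

		if (!in_mask)
			continue;
		if (!cpus[cpu].message_page)	/* the extra NULL check */
			continue;
		if (*cpus[cpu].message_page == MSG_UNLOAD)
			return cpu;
	}
	return -1;
}

/* Sets up the scenario: response on CPU 0, panic on panic_cpu. */
static int model_run(int panic_cpu, bool use_present)
{
	int pages[NR_CPUS] = { MSG_NONE };
	struct model_cpu cpus[NR_CPUS];

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		cpus[cpu].present = true;
		cpus[cpu].online = true;
		cpus[cpu].message_page = &pages[cpu];
	}
	pages[0] = MSG_UNLOAD;	/* assumption: response arrives on CPU 0 */
	model_stop_other_cpus(cpus, panic_cpu);
	return model_scan(cpus, use_present);
}
```

In this model, model_run(2, false) scans only the panic'ing CPU 2 and
returns -1, which corresponds to the 100 second timeout described in the
commit message, while model_run(2, true) returns 0. With the panic on
CPU 0, both variants find the message.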