To: Elliot Berman, Juergen Gross, Alexey Makhalov, Catalin Marinas, Will Deacon
Cc: Prakruthi Deepak Heragu, virtualization@lists.linux-foundation.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Murali Nalajala
From: "Srivatsa S. Bhat"
Subject: Re: [PATCH v3] arm64: paravirt: Use RCU read locks to guard stolen_time
Date: Fri, 13 May 2022 16:32:53 -0700
In-Reply-To: <20220513174654.362169-1-quic_eberman@quicinc.com>

On 5/13/22 10:46 AM, Elliot Berman wrote:
> From: Prakruthi Deepak Heragu
>
> During hotplug, the stolen time data structure is unmapped and memset.
> There is a possibility of the timer IRQ being triggered before memset
> and stolen time is getting updated as part of this timer IRQ handler. This
> causes the below crash in timer handler -
>
> [ 3457.473139][ C5] Unable to handle kernel paging request at virtual address ffffffc03df05148
> ...
> [ 3458.154398][ C5] Call trace:
> [ 3458.157648][ C5] para_steal_clock+0x30/0x50
> [ 3458.162319][ C5] irqtime_account_process_tick+0x30/0x194
> [ 3458.168148][ C5] account_process_tick+0x3c/0x280
> [ 3458.173274][ C5] update_process_times+0x5c/0xf4
> [ 3458.178311][ C5] tick_sched_timer+0x180/0x384
> [ 3458.183164][ C5] __run_hrtimer+0x160/0x57c
> [ 3458.187744][ C5] hrtimer_interrupt+0x258/0x684
> [ 3458.192698][ C5] arch_timer_handler_virt+0x5c/0xa0
> [ 3458.198002][ C5] handle_percpu_devid_irq+0xdc/0x414
> [ 3458.203385][ C5] handle_domain_irq+0xa8/0x168
> [ 3458.208241][ C5] gic_handle_irq.34493+0x54/0x244
> [ 3458.213359][ C5] call_on_irq_stack+0x40/0x70
> [ 3458.218125][ C5] do_interrupt_handler+0x60/0x9c
> [ 3458.223156][ C5] el1_interrupt+0x34/0x64
> [ 3458.227560][ C5] el1h_64_irq_handler+0x1c/0x2c
> [ 3458.232503][ C5] el1h_64_irq+0x7c/0x80
> [ 3458.236736][ C5] free_vmap_area_noflush+0x108/0x39c
> [ 3458.242126][ C5] remove_vm_area+0xbc/0x118
> [ 3458.246714][ C5] vm_remove_mappings+0x48/0x2a4
> [ 3458.251656][ C5] __vunmap+0x154/0x278
> [ 3458.255796][ C5] stolen_time_cpu_down_prepare+0xc0/0xd8
> [ 3458.261542][ C5] cpuhp_invoke_callback+0x248/0xc34
> [ 3458.266842][ C5] cpuhp_thread_fun+0x1c4/0x248
> [ 3458.271696][ C5] smpboot_thread_fn+0x1b0/0x400
> [ 3458.276638][ C5] kthread+0x17c/0x1e0
> [ 3458.280691][ C5] ret_from_fork+0x10/0x20
>
> As a fix, introduce rcu lock to update stolen time structure.
>
> Suggested-by: Will Deacon
> Signed-off-by: Prakruthi Deepak Heragu
> Signed-off-by: Elliot Berman
> ---

Looks good to me, but one quick question though (see below).

Reviewed-by: Srivatsa S. Bhat (VMware)

>
>  static int stolen_time_cpu_down_prepare(unsigned int cpu)
>  {
> +	struct pvclock_vcpu_stolen_time *kaddr = NULL;
>  	struct pv_time_stolen_time_region *reg;
>
>  	reg = this_cpu_ptr(&stolen_time_region);
>  	if (!reg->kaddr)
>  		return 0;
>
> -	memunmap(reg->kaddr);
> -	memset(reg, 0, sizeof(*reg));
> +	kaddr = rcu_replace_pointer(reg->kaddr, NULL, true);
> +	synchronize_rcu();
> +	memunmap(kaddr);
>

The original code used to memset the stolen time region, but this patch
seems to drop it. Was that change intentional?

Regards,
Srivatsa
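
For reference, a minimal userspace sketch of the teardown ordering in the
quoted hunk (unpublish the pointer, wait for readers, then release the
mapping), which also makes the memset question concrete. Everything here is
a hypothetical stand-in, not kernel code: struct region, struct stolen_time,
read_stolen(), wait_for_readers(), region_teardown() and region_is_zeroed()
are illustrative names. atomic_exchange() stands in for
rcu_replace_pointer(), the reader counter for rcu_read_lock() and
synchronize_rcu(), and free() for memunmap(). It assumes the region
structure holds nothing but the kaddr pointer; if it has other fields, the
final check says nothing about them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the mapped stolen-time record. */
struct stolen_time {
	uint64_t stolen;
};

/* Assumption: the per-CPU region contains only the published pointer. */
struct region {
	_Atomic(struct stolen_time *) kaddr;
};

/* Models how many readers are inside a read-side critical section. */
static int active_readers;

/* Reader: enter the section, load the pointer once, use it only if set. */
static uint64_t read_stolen(struct region *reg)
{
	struct stolen_time *p;
	uint64_t ret = 0;

	active_readers++;			/* models rcu_read_lock() */
	p = atomic_load(&reg->kaddr);		/* models rcu_dereference() */
	if (p)
		ret = p->stolen;
	active_readers--;			/* models rcu_read_unlock() */
	return ret;
}

/*
 * Stand-in for synchronize_rcu(). This model is single-threaded, so it
 * cannot block; it only checks that no reader is still in its section.
 */
static void wait_for_readers(void)
{
	assert(active_readers == 0);
}

/* Teardown in the order the hunk uses: unpublish, wait, release. */
static int region_teardown(struct region *reg)
{
	struct stolen_time *old;

	old = atomic_exchange(&reg->kaddr, (struct stolen_time *)NULL);
	if (!old)
		return 0;

	wait_for_readers();
	free(old);				/* models memunmap() */
	return 0;
}

/* Nonzero if every byte of the region is zero, as memset() would leave it. */
static int region_is_zeroed(const struct region *reg)
{
	const unsigned char *bytes = (const unsigned char *)reg;
	size_t i;

	for (i = 0; i < sizeof(*reg); i++)
		if (bytes[i])
			return 0;
	return 1;
}
```

Under the single-pointer assumption, and on platforms where a null pointer
is all-zero bits, region_is_zeroed() is nonzero after region_teardown(): the
pointer replacement alone leaves the same bytes a memset would. Whether that
holds for the real structure is exactly the question above.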