Date: Thu, 8 Oct 2020 10:30:40 -0700 (PDT)
From: Stefano Stabellini
To: Masami Hiramatsu
cc: Stefano Stabellini, Julien Grall, xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org, Alex Bennée, takahiro.akashi@linaro.org, jgross@suse.com, boris.ostrovsky@oracle.com
Subject: Re: [PATCH] arm/arm64: xen: Fix to convert
 percpu address to gfn correctly
In-Reply-To: <20201008172806.1591ebb538946c5ee93d372a@kernel.org>
Message-ID:
References: <160190516028.40160.9733543991325671759.stgit@devnote2> <20201006114058.b93839b1b8f35a470874572b@kernel.org> <20201008172806.1591ebb538946c5ee93d372a@kernel.org>
User-Agent: Alpine 2.21 (DEB 202 2017-01-01)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 8 Oct 2020, Masami Hiramatsu wrote:
> On Tue, 6 Oct 2020 10:56:52 -0700 (PDT)
> Stefano Stabellini wrote:
>
> > On Tue, 6 Oct 2020, Masami Hiramatsu wrote:
> > > On Mon, 5 Oct 2020 18:13:22 -0700 (PDT)
> > > Stefano Stabellini wrote:
> > >
> > > > On Mon, 5 Oct 2020, Julien Grall wrote:
> > > > > Hi Masami,
> > > > >
> > > > > On 05/10/2020 14:39, Masami Hiramatsu wrote:
> > > > > > Use per_cpu_ptr_to_phys() instead of virt_to_phys() for per-cpu
> > > > > > address conversion.
> > > > > >
> > > > > > In xen_starting_cpu(), per-cpu xen_vcpu_info address is converted
> > > > > > to gfn by virt_to_gfn() macro. However, since the virt_to_gfn(v)
> > > > > > assumes the given virtual address is in contiguous kernel memory
> > > > > > area, it can not convert the per-cpu memory if it is allocated on
> > > > > > vmalloc area (depends on CONFIG_SMP).
> > > > >
> > > > > Are you sure about this? I have a .config with CONFIG_SMP=y where the per-cpu
> > > > > region for CPU0 is allocated outside of vmalloc area.
> > > > >
> > > > > However, I was able to trigger the bug as soon as CONFIG_NUMA_BALANCING was
> > > > > enabled.
> > > >
> > > > I cannot reproduce the issue with defconfig, but I can with Masami's
> > > > kconfig.
> > > >
> > > > If I disable just CONFIG_NUMA_BALANCING from Masami's kconfig, the
> > > > problem still appears.
> > > >
> > > > If I disable CONFIG_NUMA from Masami's kconfig, it works, which is
> > > > strange because CONFIG_NUMA is enabled in defconfig, and defconfig
> > > > works.
> > >
> > > Hmm, strange, because when I disabled CONFIG_NUMA_BALANCING, the issue
> > > disappeared.
> > >
> > > --- config-5.9.0-rc4+ 2020-10-06 11:36:20.620107129 +0900
> > > +++ config-5.9.0-rc4+.buggy 2020-10-05 21:04:40.369936461 +0900
> > > @@ -131,7 +131,8 @@
> > >  CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
> > >  CONFIG_CC_HAS_INT128=y
> > >  CONFIG_ARCH_SUPPORTS_INT128=y
> > > -# CONFIG_NUMA_BALANCING is not set
> > > +CONFIG_NUMA_BALANCING=y
> > > +CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
> > >  CONFIG_CGROUPS=y
> > >  CONFIG_PAGE_COUNTER=y
> > >  CONFIG_MEMCG=y
> > >
> > > So buggy config just enabled NUMA_BALANCING (and default enabled)
> >
> > Yeah but both NUMA and NUMA_BALANCING are enabled in defconfig which
> > works fine...
>
> Hmm, I found that the xen_vcpu_info was allocated on km if the Dom0 has
> enough memory. On my environment, if Xen passed 2GB of RAM to Dom0, it
> was allocated on kernel linear mapped address, but with 1GB of RAM, it
> was on vmalloc area.
> As far as I can see, it seems that the percpu allocates memory from
> 2nd chunk if the default memory size is small.
>
> I've built a kernel with patch [1] and boot the same kernel up with
> different dom0_mem option with "trace_event=percpu:*" kernel cmdline.
> Then got following logs.
>
> Boot with 4GB:
> -0 [000] .... 0.543208: percpu_create_chunk: base_addr=000000005d5ad71c
> [...]
> systemd-1 [000] .... 0.568931: percpu_alloc_percpu: reserved=0 is_atomic=0 size=48 align=8 base_addr=00000000fa92a086 off=32672 ptr=000000008da0b73d
> systemd-1 [000] .... 0.568938: xen_guest_init: Xen: alloc xen_vcpu_info ffff800011003fa0 id=000000008da0b73d
> systemd-1 [000] .... 0.586635: xen_starting_cpu: Xen: xen_vcpu_info ffff800011003fa0, vcpup ffff00092f4ebfa0 per_cpu_offset[0] ffff80091e4e8000
>
> (NOTE: base_addr and ptr are encoded to the ids, not actual address
> because of "%p" printk format)
>
> In this log, we can see the xen_vcpu_info is allocated NOT on the
> new chunk (this is the 2nd chunk).
> As you can see, the vcpup is in
> the kernel linear address in this case, because it came from the
> 1st kernel embedded chunk.
>
>
> Boot with 1GB
> -0 [000] .... 0.516221: percpu_create_chunk: base_addr=000000008456b989
> [...]
> systemd-1 [000] .... 0.541982: percpu_alloc_percpu: reserved=0 is_atomic=0 size=48 align=8 base_addr=000000008456b989 off=17920 ptr=00000000c247612d
> systemd-1 [000] .... 0.541989: xen_guest_init: Xen: alloc xen_vcpu_info 7dff951f0600 id=00000000c247612d
> systemd-1 [000] .... 0.559690: xen_starting_cpu: Xen: xen_vcpu_info 7dff951f0600, vcpup fffffdffbfcdc600 per_cpu_offset[0] ffff80002aaec000
>
> On the other hand, when we boot the dom0 with 1GB memory, the xen_vcpu_info
> is allocated on the new chunk (the id of base_addr is same).
> Since the data of new chunk is allocated on vmalloc area, vcpup points
> the vmalloc address.
>
> So, the bug seems not to depend on the kconfig, but depends on where
> the percpu memory is allocated from.
>
>
> [...]
> >
> > > > The fix is fine for me. I tested it and it works. We need to remove the
> > > > "Fixes:" line from the commit message. Ideally, replacing it with a
> > > > reference to what is the source of the problem.
> > >
> > > OK, as I said, it seems commit 9a9ab3cc00dc ("xen/arm: SMP support") has
> > > introduced the per-cpu code. So note it instead of Fixes tag.
> >
> > ...and commit 9a9ab3cc00dc was already present in 5.8 which also works
> > fine with your kconfig. Something else changed in 5.9 causing this
> > breakage as a side effect. Commit 9a9ab3cc00dc is there since 2013, I
> > think it is OK -- this patch is fixing something else.
>
> I think the commit 9a9ab3cc00dc is theoretically wrong because it uses
> __pa() on percpu address. But that is not guaranteed according to the
> comment on per_cpu_ptr_to_phys() as below.
>
> ----
>  * percpu allocator has special setup for the first chunk, which currently
>  * supports either embedding in linear address space or vmalloc mapping,
>  * and, from the second one, the backing allocator (currently either vm or
>  * km) provides translation.
>  *
>  * The addr can be translated simply without checking if it falls into the
>  * first chunk. But the current code reflects better how percpu allocator
>  * actually works, and the verification can discover both bugs in percpu
>  * allocator itself and per_cpu_ptr_to_phys() callers. So we keep current
>  * code.
> ----
>
> So we must use per_cpu_ptr_to_phys() instead of __pa() macro for percpu
> address. That's why I pointed this will fix the commit 9a9ab3cc00dc.

Thank you for the analysis. We are going to try to get the patch
upstream as soon as we can.
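
For anyone re-checking the traces quoted above, the relation between
xen_vcpu_info, per_cpu_offset[0] and vcpup can be verified with a small
userspace sketch in C. This is not kernel code. percpu_addr(),
in_linear_map() and check_trace_values() are hypothetical helpers, and
the linear-map range is an assumption (arm64 with 48-bit kernel virtual
addresses); other configurations may use a different layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: arm64 with 48-bit kernel virtual addresses, where the
 * linear map covers ffff000000000000..ffff7fffffffffff.
 */
#define LINEAR_MAP_START UINT64_C(0xffff000000000000)
#define LINEAR_MAP_END   UINT64_C(0xffff800000000000)

/*
 * Hypothetical helper: model a per-CPU pointer as the base address plus
 * the per-CPU offset. The unsigned addition wraps modulo 2^64.
 */
uint64_t percpu_addr(uint64_t base, uint64_t cpu_offset)
{
    return base + cpu_offset;
}

/* Hypothetical helper: true if addr lies in the assumed linear-map range. */
bool in_linear_map(uint64_t addr)
{
    return addr >= LINEAR_MAP_START && addr < LINEAR_MAP_END;
}

/* Re-check the two xen_starting_cpu trace lines quoted in this thread. */
void check_trace_values(void)
{
    /* Boot with 4GB: vcpup falls inside the linear map. */
    uint64_t vcpup_4g = percpu_addr(UINT64_C(0xffff800011003fa0),
                                    UINT64_C(0xffff80091e4e8000));
    assert(vcpup_4g == UINT64_C(0xffff00092f4ebfa0));
    assert(in_linear_map(vcpup_4g));

    /* Boot with 1GB: vcpup falls outside the linear map. */
    uint64_t vcpup_1g = percpu_addr(UINT64_C(0x00007dff951f0600),
                                    UINT64_C(0xffff80002aaec000));
    assert(vcpup_1g == UINT64_C(0xfffffdffbfcdc600));
    assert(!in_linear_map(vcpup_1g));
}
```

Calling check_trace_values() from a main() should pass silently. Under
the stated assumption, the 4GB boot yields an address in the linear map,
where virt_to_phys() happens to work, while the 1GB boot yields an
address outside it, which is the case that needs per_cpu_ptr_to_phys().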