Subject: Re: [PATCH 4/5] arm64: mm: Support ASID isolation feature
From: Yunfeng Ye
To: Catalin Marinas
Date: Tue, 29 Nov 2022 20:26:48 +0800
Message-ID: <6376ad6d-1815-c2c5-575c-2ed89b877047@huawei.com>
References: <20221017083203.3690346-1-yeyunfeng@huawei.com> <20221017083203.3690346-5-yeyunfeng@huawei.com> <3607b658-304a-ecc8-b07a-530f4a6365e8@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/11/29 1:00, Catalin Marinas wrote:
> On Thu, Nov 10, 2022 at 03:07:53PM +0800, Yunfeng Ye wrote:
>> On 2022/11/9 20:43, Catalin Marinas wrote:
>>> On Mon, Oct 17, 2022 at 04:32:02PM +0800, Yunfeng Ye wrote:
>>>> After a rollover, the global generation will be flushed, which will
>>>> cause the process mm->context.id on all CPUs do not match the
>>>> generation. Thus, the process will compete for the global spinlock lock
>>>> to reallocate a new ASID and refresh the TLBs of all CPUs on context
>>>> switch. This will lead to the increase of scheduling delay and TLB miss.
>>>>
>>>> In some delay-sensitive scenarios, for example, part of CPUs are
>>>> isolated, only a limited number of processes are deployed to run on the
>>>> isolated CPUs. In this case, we do not want these key processes to be
>>>> affected by the rollover of ASID.
>>>
>>> Part of this commit log should also go in the cover letter and it would
>>> help to back this up by some numbers, e.g. what percentage improvement
>>> you get with this patchset by running hackbench on an isolated CPU.
>>>
>>> In theory it looks like CPU isolation would benefit from this patchset
>>> but we try not to touch this code often, so any modification should come
>>> with proper justification, backed by numbers.
>>>
>> Yes, CPU isolation will benefit from this patchset.
>> We use cyclictest tool
>> to test the maximum scheduling and interrupt delays, found that the
>> sched_switch process takes several microseconds sometimes, The analysis
>> result shows that the delay is caused by the ASID refresh.
>
> Do you know whether it's predominantly the spinlock or the TLBI that's
> causing this (or just a combination of the two)?
>
I think the spinlock is the main factor, but I did not measure how much
time each of the two takes. Note also that the TLBI is currently issued
under the spinlock, so the time it takes adds to the lock hold time.

> I was talking to Will and concluded we should try to reuse the ASID
> pinning code that's already in that file rather than adding a new
> bitmap. At a high level, a thread migrating to an isolated CPU can have

At first I wanted to reuse the pinned ASID bitmap too, which is the same
idea as yours. But there is a difference between the pinned bitmap and
the isolation bitmap: the pinned bitmap is not changed when the
generation rolls over, while the isolation bitmap needs to be flushed.

The idea you mention below, "broadcast a TLBI for the pinned ASID when
the task dies", could perhaps reuse the pinned bitmap. I have considered
it too, but I think it is not as good as the current two-bitmap method:

1. It introduces some TLBI jitter and may increase contention on the
   spinlock when updating the pinned bitmap. We do not want that jitter
   on the isolated CPUs.
2. If only one pinned bitmap is used and a large number of processes in
   the isolation domain stay alive, the available ASIDs become
   insufficient. For example, with more than 65536 processes running or
   sleeping on the isolated CPUs, how should this situation be handled?

> its ASID pinned. If context switching only happens between pinned ASIDs
> on an isolated CPU, we may be able to avoid the lock even if the
> generation rolled over on another CPU.
>
> I think the tricky problem is when a pinned ASID task eventually dies,
> possibly after migrating to another CPU. If we avoided the TLBI on
> generation roll-over for the isolated CPU, it will have stale entries.
> One option would be to broadcast a TLBI for the pinned ASID when the
> task dies, though this would introduce some jitter. An alternative may
> be to track whether a pinned ASID ever run on a CPU and do a local TLBI
> for that ASID when a pinned thread is migrated.
>
> All these need a lot more thinking and (formal) modelling. I have a TLA+
> model but I haven't updated it to cover the pinned ASIDs. Or,
> alternatively, make the current code stand-alone and get it through CBMC
> (faking the spinlock as pthread mutexes and implementing some of the
> atomics in plain C with __CPROVER_atomic_begin/end).
>
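
To make the difference between the two bitmaps concrete, here is a
minimal stand-alone sketch in plain C. It is a toy model, not the code
from the patch or from arch/arm64/mm/context.c: the 8-bit ASID space and
all toy_* names are assumptions made for illustration, and clearing the
isolation bitmap at rollover is only the simplest reading of "needs to be
flushed" (how the bitmap is repopulated afterwards is not modelled). It
shows that the pinned bitmap survives a rollover unchanged, that a
context ID from the old generation stops matching, and that allocation
fails once every ASID is pinned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy sizes and names: illustrative assumptions, not the kernel's. */
#define TOY_ASID_BITS 8u
#define TOY_NUM_ASIDS (1u << TOY_ASID_BITS)
#define TOY_GEN_STEP  ((uint64_t)TOY_NUM_ASIDS)

struct toy_asid_state {
	uint64_t generation;            /* generation lives above the ASID bits */
	bool asid_map[TOY_NUM_ASIDS];   /* ASIDs reserved in this generation */
	bool pinned_map[TOY_NUM_ASIDS]; /* pinned ASIDs, untouched by rollover */
	bool iso_map[TOY_NUM_ASIDS];    /* hypothetical isolation bitmap */
};

static void toy_init(struct toy_asid_state *s)
{
	memset(s, 0, sizeof(*s));
	s->generation = TOY_GEN_STEP;   /* first generation */
}

/* True if ctx_id (generation | asid) belongs to the current generation. */
static bool toy_gen_matches(const struct toy_asid_state *s, uint64_t ctx_id)
{
	return ((ctx_id ^ s->generation) >> TOY_ASID_BITS) == 0;
}

/* Mark an ASID as pinned; it stays reserved across rollovers. */
static void toy_pin(struct toy_asid_state *s, unsigned int asid)
{
	assert(asid > 0 && asid < TOY_NUM_ASIDS);
	s->pinned_map[asid] = true;
	s->asid_map[asid] = true;
}

/*
 * Rollover: bump the generation, keep only pinned ASIDs reserved and
 * flush the isolation bitmap (modelled here as a plain clear).
 */
static void toy_rollover(struct toy_asid_state *s)
{
	s->generation += TOY_GEN_STEP;
	memcpy(s->asid_map, s->pinned_map, sizeof(s->asid_map));
	memset(s->iso_map, 0, sizeof(s->iso_map));
}

/*
 * Return a free ASID, rolling over once if the map is full.
 * Index 0 is kept reserved in this toy. Returns -1 if nothing is free
 * even after a rollover, i.e. every usable ASID is pinned.
 */
static int toy_alloc(struct toy_asid_state *s)
{
	for (int pass = 0; pass < 2; pass++) {
		for (unsigned int i = 1; i < TOY_NUM_ASIDS; i++) {
			if (!s->asid_map[i]) {
				s->asid_map[i] = true;
				return (int)i;
			}
		}
		if (pass == 0)
			toy_rollover(s);
	}
	return -1;
}
```

For example, after toy_init(), pinning the ASID returned by toy_alloc()
and then calling toy_rollover() leaves that ASID reserved in asid_map,
while an ASID that was set only in iso_map is released. Pinning every
ASID makes toy_alloc() return -1, which is the exhaustion case from
point 2 above.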