From: "Eric W. Biederman"
To: Miaohe Lin
Cc: Ying Huang
Subject: Re: [PATCH v4 1/4] mm: reduce the rcu lock duration
Date: Tue, 31 May 2022 11:09:01 -0500
Message-ID: <87bkvdfzvm.fsf@email.froward.int.ebiederm.org>
In-Reply-To: (Miaohe Lin's message of "Tue, 31 May 2022 17:01:09 +0800")
References: <20220530113016.16663-1-linmiaohe@huawei.com> <20220530113016.16663-2-linmiaohe@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Miaohe Lin writes:

> On 2022/5/31 14:06, Ying Huang wrote:
>> On Mon, 2022-05-30 at 19:30 +0800, Miaohe Lin wrote:
>>> Commit 3268c63eded4 ("mm: fix move/migrate_pages() race on task struct")
>>> extends the period of the rcu_read_lock until after the permissions checks
>>> are done to prevent the task pointed to from changing from under us. But
>>> since the task_struct refcount is also taken at that time, the reference
>>> to task is guaranteed to be stable. So it's unnecessary to extend the
>>> period of the rcu_read_lock. Release the rcu lock after task refcount is
>>> successfully grabbed to reduce the rcu holding time.
>>
>> Sorry for the late reply, I have been busy with something else recently.
>
> That's all right. Many thanks for your hard work. :)
>
>>
>> I have just read the whole thread of the original patch discussion.
>> During the discussion, in
>>
>> https://lore.kernel.org/lkml/alpine.DEB.2.00.1202241131400.3726@router.home/
>>
>> a patch that is the same as yours is proposed. Then in the following
>> message, Eric thinks that the rcu read lock should not be released until
>> permission is checked,
>>
>> https://lore.kernel.org/lkml/87sjhzun47.fsf@xmission.com/
>>
>> "
>> At the moment I suspect the permissions checks are not safe unless
>> performed under both rcu_read_lock and task_lock to ensure that
>> the task<->mm association does not change on us while we are
>> working. Even with that the cred can change under us but at least
>> we know the cred will be valid until rcu_read_unlock happens.
>> "
>>
>> So the rcu lock duration is enlarged in the following message.
>>
>> https://lore.kernel.org/lkml/alpine.DEB.2.00.1202271238450.32410@router.home/
>>
>> But, after some thought, I don't think the extended rcu read lock adds
>> much value, because after permission checking the permission may still
>> be changed. There's not much difference.
>>
>> So, I have no objection to the patch itself. But you should add more
>> information in the patch description about why the RCU protected region
>> is extended and why we can reduce it.
>
> Does the patch description below make sense to you?
>
> "
> Commit 3268c63eded4 ("mm: fix move/migrate_pages() race on task struct")
> extends the period of the rcu_read_lock until after the permissions checks
> are done because it suspects the permissions checks are not safe unless
> performed under both rcu_read_lock and task_lock to ensure the task<->mm
> association does not change on us while we are working [1]. But the
> extended rcu read lock does not add much value, because after permission
> checking the permission may still be changed. There's not much difference.
> So it's unnecessary to extend the period of the rcu_read_lock. Release the
> rcu lock after task refcount is successfully grabbed to reduce the rcu
> holding time.
>
> [1] https://lore.kernel.org/lkml/87sjhzun47.fsf@xmission.com/
> "

It doesn't make sense to me.

I don't see any sleeping functions called from find_mm_struct or
kernel_migrate_pages in the area of the code protected by
get_task_struct. So at a very basic level I see no justification for
dirtying a cache line twice with get_task_struct and put_task_struct
to reduce rcu_read_lock hold times.

I would contend that a reasonable cleanup based upon the current state
of the code would be to extend the rcu_read_lock over get_task_mm so
that a reference to task_struct does not need to be taken. That has
the potential to reduce contention and reduce lock hold times.

The code is missing a big fat comment with the assertion that it is ok
if the permission checks are racy because the race is small, and the
worst case thing that happens is the page is migrated to another
NUMA node.

Given that get_task_mm takes task_lock, the cost of dirtying the
cache line is already being paid.
Perhaps not extending task_lock hold times a little bit is justified,
but I haven't seen that case made.

This seems like code that is called little enough that it would be
better for it to be correct, and not need big fat comments explaining
why it doesn't matter that the code is deliberately buggy.

In short, it does not make sense to me to justify a patch for
performance reasons when it appears that extending the rcu_read_lock
hold time and not touching the task reference count would stop dirtying
a cache line and likely have more impact.

Eric