From: Valentin Schneider
To: Nadav Amit
Cc: Linux Kernel Mailing List, linux-trace-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org, kvm@vger.kernel.org, linux-mm, bpf,
 the arch/x86 maintainers, Steven Rostedt, Masami Hiramatsu,
 Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, H. Peter Anvin, Paolo Bonzini, Wanpeng Li,
 Vitaly Kuznetsov, Andy Lutomirski, Peter Zijlstra, Frederic Weisbecker,
 Paul E. McKenney, Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
 Lorenzo Stoakes, Josh Poimboeuf, Kees Cook, Sami Tolvanen,
 Ard Biesheuvel, Nicholas Piggin, Juerg Haefliger,
 Nicolas Saenz Julienne, Kirill A. Shutemov, Dan Carpenter, Chuang Wang,
 Yang Jihong, Petr Mladek, Jason A. Donenfeld, Song Liu, Julian Pidancet,
 Tom Lendacky, Dionna Glaze, Thomas Weißschuh, Juri Lelli,
 Daniel Bristot de Oliveira, Marcelo Tosatti, Yair Podemsky
Subject: Re: [RFC PATCH 00/14] context_tracking,x86: Defer some IPIs until a user->kernel transition
In-Reply-To: <57D81DB6-2D96-4A12-9FD5-6F0702AC49F6@vmware.com>
References: <20230705181256.3539027-1-vschneid@redhat.com> <57D81DB6-2D96-4A12-9FD5-6F0702AC49F6@vmware.com>
Date: Thu, 06 Jul 2023 12:29:58 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/07/23 18:48, Nadav Amit wrote:
>> On Jul 5, 2023, at 11:12 AM, Valentin Schneider wrote:
>>
>> Deferral approach
>> =================
>>
>> Storing each and every callback, like a secondary call_single_queue turned out
>> to be a no-go: the whole point of deferral is to keep NOHZ_FULL CPUs in
>> userspace for as long as possible - no signal of any form would be sent when
>> deferring an IPI.
>> This means that any form of queuing for deferred callbacks
>> would end up as a convoluted memory leak.
>>
>> Deferred IPIs must thus be coalesced, which this series achieves by assigning
>> IPIs a "type" and having a mapping of IPI type to callback, leveraged upon
>> kernel entry.
>
> I have some experience with similar an optimization. Overall, it can make
> sense and as you show, it can reduce the number of interrupts.
>
> The main problem of such an approach might be in cases where a process
> frequently enters and exits the kernel between deferred-IPIs, or even worse -
> the IPI is sent while the remote CPU is inside the kernel. In such cases, you
> pay the extra cost of synchronization and cache traffic, and might not even
> get the benefit of reducing the number of IPIs.
>
> In a sense, it's a more extreme case of the overhead that x86’s lazy-TLB
> mechanism introduces while tracking whether a process is running or not. But
> lazy-TLB would change is_lazy much less frequently than context tracking,
> which means that the deferring the IPIs as done in this patch-set has a
> greater potential to hurt performance than lazy-TLB.
>
> tl;dr - it would be beneficial to show some performance number for both a
> “good” case where a process spends most of the time in userspace, and “bad”
> one where a process enters and exits the kernel very frequently. Reducing
> the number of IPIs is good but I don’t think it is a goal by its own.
>

There already is a significant overhead incurred on kernel entry for
nohz_full CPUs due to all of context_tracking faff; now I *am* making it
worse with that extra atomic, but I get the feeling it's not going to stay
:D

nohz_full CPUs that do context transitions very frequently are
unfortunately in the realm of "you shouldn't do that".
Due to what's out there I have to care about *occasional* transitions, but
some folks consider even that to be broken usage, so I don't believe
getting numbers for that would be very relevant.

> [ BTW: I did not go over the patches in detail. Obviously, there are
>   various delicate points that need to be checked, as avoiding the
>   deferring of IPIs if page-tables are freed. ]
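
For anyone following the thread, the coalescing scheme quoted above from the
cover letter can be sketched in plain userspace C. This is only an
illustration under assumptions, not the code from the series: the enum
values, the callback names and the single pending_work bitmask are all
hypothetical, one CPU is modelled, and C11 atomics stand in for the kernel's
own primitives.

```c
#include <stdatomic.h>

/* Hypothetical IPI "types"; the names are illustrative only. */
enum deferred_work {
	WORK_SYNC_CORE = 0,
	WORK_TLB_FLUSH = 1,
	WORK_MAX
};

/* Counters so the effect of coalescing can be observed. */
int sync_core_calls;
int tlb_flush_calls;

static void do_sync_core(void) { sync_core_calls++; }
static void do_tlb_flush(void) { tlb_flush_calls++; }

/* Mapping of work type to callback, consulted on kernel entry. */
static void (*const work_cb[WORK_MAX])(void) = {
	[WORK_SYNC_CORE] = do_sync_core,
	[WORK_TLB_FLUSH] = do_tlb_flush,
};

/* Pending work for one (modelled) CPU: one bit per type, no queue. */
atomic_uint pending_work;

/*
 * Sender side: instead of sending an IPI, set the bit for this type.
 * Repeated requests of the same type collapse into the same bit, so
 * nothing accumulates while the target stays in userspace.
 */
void defer_work(enum deferred_work type)
{
	atomic_fetch_or(&pending_work, 1u << type);
}

/*
 * Target side: on kernel entry, atomically take and clear the mask,
 * then run each pending callback exactly once.
 */
void kernel_entry_flush(void)
{
	unsigned int work = atomic_exchange(&pending_work, 0);

	for (int type = 0; type < WORK_MAX; type++)
		if (work & (1u << type))
			work_cb[type]();
}
```

Calling defer_work(WORK_TLB_FLUSH) three times followed by one
kernel_entry_flush() runs do_tlb_flush() a single time, and a second
kernel_entry_flush() with nothing pending runs no callback. The
atomic_exchange() on every entry is the kind of extra atomic cost discussed
above.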