Subject: Re: [PATCH 08/10] posix-cpu-timers: Migrate to use new tick dependency mask model
From: Chris Metcalf
To: Frederic Weisbecker
CC: LKML, Peter Zijlstra, Thomas Gleixner, Preeti U Murthy,
 Christoph Lameter, Ingo Molnar, Viresh Kumar, Rik van Riel
Date: Thu, 30 Jul 2015 15:35:06 -0400
Message-ID: <55BA7C6A.1050602@ezchip.com>
In-Reply-To: <20150730004444.GA14744@lerouge>
References: <1437669735-8786-1-git-send-email-fweisbec@gmail.com>
 <1437669735-8786-9-git-send-email-fweisbec@gmail.com>
 <55B26E74.5040803@ezchip.com>
 <20150729132343.GC11554@lerouge>
 <55B90C40.5090000@ezchip.com>
 <20150730004444.GA14744@lerouge>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/29/2015 08:44 PM, Frederic Weisbecker wrote:
> On Wed, Jul 29, 2015 at 01:24:16PM -0400, Chris Metcalf wrote:
>> On 07/29/2015 09:23 AM, Frederic Weisbecker wrote:
>>>> At a higher level, is the posix-cpu-timers code here really providing the
>>>> right semantics? It seems like before, the code was checking a struct
>>>> task-specific state, and now you are setting a global state such that if ANY
>>>> task anywhere in the system (even on housekeeping cores) has a pending posix
>>>> cpu timer, then nothing can go into nohz_full mode.
>>>>
>>>> Perhaps what is needed is a task_struct->tick_dependency to go along with
>>>> the system-wide and per-cpu flag words?
>>> That's an excellent point! Indeed the tick dependency check on posix-cpu-timers
>>> was made on task granularity before and now it's a global dependency.
>>>
>>> Which means that if any task in the system has a posix-cpu-timer enqueued, it
>>> prevents all CPUs from shutting down the tick. I need to mention that in the
>>> changelog.
>>>
>>> Now here is the rationale: I expect that nohz full users are not interested in
>>> posix cpu timers at all. The only chance for one to run without breaking the
>>> isolation is on housekeeping CPUs. So perhaps there is a corner case somewhere
>>> but I assume there isn't until somebody reports an issue.
>>>
>>> Keeping a task level dependency check means that we need to update it on context
>>> switch. Plus it's not only about task but also process. So that means two
>>> states to update on context switch and to check from interrupts. I don't think
>>> it's worth the effort if there is no user at all.
>> I really worry about this! The vision EZchip offers our customers is
>> that they can run whatever they want on the slow path housekeeping
>> cores, i.e. random control-plane code. Then, on the fast-path cores,
>> they run their nohz_full stuff without interruption. Often they don't
>> even know what the hell is running on their control plane cores - SNMP
>> or random third-party crap or god knows what. And there is a decent
>> likelihood that some posix cpu timer code might sneak in.
> I see. But note that installing a posix cpu timer ends up triggering an
> IPI to all nohz full CPUs. That's how nohz full has always behaved.
> So users running posix timers on nohz should already suffer issues anyway.

True now, yes, I'm just looking ahead to doing better when we have
a chance to improve things.

>> You mentioned needing two fields, for task and for process, but in
>> fact let's just add the one field to the one thing that needs it and
>> not worry about additional possible future needs. And note that it's
>> the task_struct->signal where we need to add the field for posix cpu
>> timers (the signal_struct) since that's where the sharing occurs, and
>> given CLONE_SIGHAND I imagine it could be different from the general
>> "process" model anyway.
> Well, posix cpu timers can be installed per process (signal struct) or
> per thread (task struct).
>
> But we can certainly simplify that with a per process flag and expand
> the thread dependency to the process scope.
>
> Still there is the issue of telling the CPUs where a process runs when
> a posix timer is installed there. There is no process-like tsk->cpus_allowed.
> Either we send an IPI everywhere like we do now or we iterate through all
> threads in the process to OR all their cpumasks in order to send that IPI.

Is there a reason the actual timer can't run on a housekeeping core?
Then when it does wake_up_process() or whatever, the specific target
task will get an IPI to wake up at that point.

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com