Date: Thu, 8 Jul 2021 12:12:20 +0200
From: Petr Mladek
To: Vasily Gorbik
Cc: Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Joe Lawrence,
    Heiko Carstens, Sven Schnelle, Sumanth Korikkar,
    live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] livepatch: Kick idle cpu's tasks to perform transition

On Wed 2021-07-07 14:49:38, Vasily Gorbik wrote:
> On an idle system with large amount of cpus it might happen that
> klp_update_patch_state() is not reached in do_idle() for a long periods
> of time. With debug messages enabled log is filled with:
> [ 499.442643] livepatch: klp_try_switch_task: swapper/63:0 is running

I see. I guess that the problem happens only when CONFIG_NO_HZ is enabled.
Do I get it correctly?

> without any signs of progress. Ending up with "failed to complete
> transition".
>
> On s390 LPAR with 128 cpus not a single transition is able to complete
> and livepatch kselftests fail.
>
> To deal with that, make sure we break out of do_idle() inner loop to
> reach klp_update_patch_state() by marking idle tasks as NEED_RESCHED
> as well as kick cpus out of idle state.
>
> Signed-off-by: Vasily Gorbik
> ---
>  kernel/livepatch/transition.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index 3a4beb9395c4..793eba46e970 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -415,8 +415,11 @@ void klp_try_complete_transition(void)
>  	for_each_possible_cpu(cpu) {
>  		task = idle_task(cpu);
>  		if (cpu_online(cpu)) {
> -			if (!klp_try_switch_task(task))
> +			if (!klp_try_switch_task(task)) {
>  				complete = false;
> +				set_tsk_need_resched(task);
> +				kick_process(task);

First, we should kick the idle threads in klp_send_signals(). It already
solves a similar problem when normal threads and kthreads stay in
interruptible sleep for too long.

Second, the approach looks a bit hacky to me. need_resched() depends on
the current implementation of the idle loop. kick_process() has
a completely different purpose and does checks that do not fit this
use case well.

I wonder if wake_up_nohz_cpu() would fit better here.

Please, add scheduler people into CC, namely:

    Ingo Molnar
    Peter Zijlstra

and NOHZ guys:

    Frederic Weisbecker
    Thomas Gleixner

Best Regards,
Petr
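
The stall discussed in this thread can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not
kernel code: every name in it (struct model_cpu, model_try_complete(),
model_run(), and so on) is hypothetical, and a single "kicked" flag
stands in for whatever mechanism ends the inner idle loop (NEED_RESCHED
plus kick_process() in the RFC, or possibly wake_up_nohz_cpu()). It
assumes that an idle task can never be switched from outside because it
is always seen as running, and that its patch state is only updated once
the CPU leaves the inner idle loop.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical per-CPU state of the idle task in this model. */
struct model_cpu {
	bool patch_pending;	/* models TIF_PATCH_PENDING on the idle task */
	bool kicked;		/* models a wakeup that ends the inner idle loop */
};

/* Models klp_try_switch_task() on an idle task: it is always "running",
 * so it counts as switched only after it has updated its own state. */
static bool model_try_switch_idle_task(const struct model_cpu *c)
{
	return !c->patch_pending;
}

/* Models one pass of the idle loop on a tickless idle CPU. */
static void model_idle_pass(struct model_cpu *c)
{
	if (!c->kicked)
		return;			/* still sleeping in the inner loop */
	c->kicked = false;
	c->patch_pending = false;	/* models klp_update_patch_state(current) */
}

/* One transition attempt; returns true when every idle task has switched. */
static bool model_try_complete(struct model_cpu *cpus, int n, bool kick)
{
	bool complete = true;

	for (int i = 0; i < n; i++) {
		if (!model_try_switch_idle_task(&cpus[i])) {
			complete = false;
			if (kick)
				cpus[i].kicked = true;
		}
	}
	return complete;
}

/* Returns the attempt on which the transition completed, or -1 if it
 * did not complete within max_tries attempts. */
int model_run(bool kick, int max_tries)
{
	struct model_cpu cpus[MODEL_NR_CPUS];

	assert(max_tries > 0);
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		cpus[i] = (struct model_cpu){ .patch_pending = true, .kicked = false };

	for (int t = 1; t <= max_tries; t++) {
		if (model_try_complete(cpus, MODEL_NR_CPUS, kick))
			return t;
		for (int i = 0; i < MODEL_NR_CPUS; i++)
			model_idle_pass(&cpus[i]);
	}
	return -1;
}
```

In this model, model_run(false, 10) returns -1 because no idle task ever
leaves the inner loop, which corresponds to the "failed to complete
transition" case above. model_run(true, 10) returns 2: the first attempt
kicks the CPUs and the second one finds all idle tasks switched. The
model says nothing about which kernel primitive is the right way to do
the kick.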