From: Tetsuo Handa
To: chris@chris-wilson.co.uk, linux-kernel@vger.kernel.org
Cc: mingo@kernel.org, akpm@linux-foundation.org, ak@linux.intel.com,
    jack@suse.cz, aryabinin@virtuozzo.com, dvyukov@google.com
Subject: Re: [PATCH] khungtaskd: Kick stuck processes
Date: Fri, 9 Feb 2018 20:30:41 +0900
Message-Id: <201802092030.GCE64020.JVOLHFFQSOtMFO@I-love.SAKURA.ne.jp>
In-Reply-To: <151813172079.28809.12438916989037864311@mail.alporthouse.com>
References: <20180208190753.17690-1-chris@chris-wilson.co.uk>
    <201802090810.DBF09356.OFMQFVFJtOHOLS@I-love.SAKURA.ne.jp>
    <151813172079.28809.12438916989037864311@mail.alporthouse.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Chris Wilson wrote:
> Quoting Tetsuo Handa (2018-02-08 23:10:43)
> > Chris Wilson wrote:
> > > After spotting a stuck process, and having decided not to panic, give
> > > the task a kick to see if that helps it to recover (e.g. to paper over a
> > > missed wake up).
> >
> > Yes, we are seeing hangs at io_schedule(), but doesn't optionally allowing
> > io_schedule() be replaced with timeout version (e.g. dump_page() upon timeout
> > if io_schedule() was called for e.g. wait_on_page_bit()) give us more clue?
>
> Yes, this isn't for debugging who left the page locked (or the exact
> root cause), this is just trying to allow the system to limp along
> afterwards :) From personal experience, I know how easy it is to lose a
> wakeup and the only thing to notice is khungtaskd shouting every 120s.

Calling wake_up_process() does not sleep, does it? Then I think you can do
this with SystemTap, because SystemTap gives you the ability to call exported
functions at an (almost) arbitrary line of an (almost) arbitrary file. See
https://events.static.linuxfound.org/sites/events/files/slides/LCJ2014-en_0.pdf
for an example.