From: KY Srinivasan
To: Jan Kara, Andrew Morton
CC: LKML, "pmladek@suse.com", "rostedt@goodmis.org", Gavin Hu, Jan Kara, Vitaly Kuznetsov
Subject: RE: [PATCH 0/4] printk: Softlockup avoidance
Date: Thu, 20 Aug 2015 02:37:42 +0000
References: <1439998711-7013-1-git-send-email-jack@suse.com>
In-Reply-To: <1439998711-7013-1-git-send-email-jack@suse.com>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

> -----Original Message-----
> From: Jan Kara [mailto:jack@suse.com]
> Sent: Wednesday, August 19, 2015 8:38 AM
> To: Andrew Morton
> Cc: LKML; pmladek@suse.com; rostedt@goodmis.org; Gavin Hu;
> KY Srinivasan; Jan Kara
> Subject: [PATCH 0/4] printk: Softlockup avoidance
>
> From: Jan Kara
>
> Hello,
>
> since lately there were several attempts at dealing with softlockups due
> to heavy printk traffic [1] [2] and I've been also privately pinged by
> a couple of people about the state of the patch set, I've decided to
> respin the patch set.
>
> To remind the original problem:
>
> Currently, console_unlock() prints messages from the kernel printk buffer
> to the console while the buffer is non-empty. When a serial console is
> attached, printing is slow and thus other CPUs in the system have plenty
> of time to append new messages to the buffer while one CPU is printing.
> Thus the CPU can spend an unbounded amount of time printing in
> console_unlock(). This is especially serious when printk() gets called
> under some critical spinlock or with interrupts disabled.
>
> In practice users have observed that a CPU can spend tens of seconds
> printing in console_unlock() (usually during boot when hundreds of SCSI
> devices are discovered), resulting in RCU stalls (the CPU doing printing
> doesn't reach a quiescent state for a long time), softlockup reports
> (IPIs for the printing CPU don't get served and thus other CPUs are
> spinning waiting for the printing CPU to process IPIs), and eventually
> machine death (as messages from stalls and lockups are appended to the
> printk buffer faster than we are able to print). So these machines are
> unable to boot with a serial console attached. Also, during artificial
> stress testing a SATA disk disappears from the system because its
> interrupts aren't served for too long.
>
> This series addresses the problem in the following way: If a CPU has
> printed more than printk_offload (defaults to 1000) characters, it wakes
> up one of the dedicated printk kthreads (we don't use a workqueue because
> that has deadlock potential if printk was called from workqueue code).
> Once we find out the kthread is spinning on a lock, we stop printing,
> drop console_sem, and let the kthread continue printing. Since there are
> two printing kthreads, they will pass printing between them and thus no
> CPU gets hogged by printing.
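
To make the hand-off rule above concrete, here is a minimal userspace
sketch of the bounding behaviour only. It is not the code from the patch
set: struct sim_console, sim_console_turn() and sim_drain() are
hypothetical names invented for illustration, the "console" is just a
counter of pending characters, and the kthreads, console_sem and the
spinlock are reduced to a helper_waiting flag. PRINTK_OFFLOAD mirrors the
default of 1000 quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the printk_offload tunable (default 1000 per the cover letter). */
#define PRINTK_OFFLOAD 1000

/* Hypothetical model of the console state; not a kernel structure. */
struct sim_console {
	size_t len;      /* total characters queued in the log buffer */
	size_t pos;      /* characters already printed */
	int handoffs;    /* how many times printing was passed on */
};

/*
 * One printer's turn. Print until the buffer is empty, or until this
 * printer has emitted more than PRINTK_OFFLOAD characters and another
 * printer is known to be waiting. Returns the characters printed in
 * this turn.
 */
size_t sim_console_turn(struct sim_console *c, int helper_waiting)
{
	size_t printed = 0;

	assert(c != NULL);
	while (c->pos < c->len) {
		if (printed > PRINTK_OFFLOAD && helper_waiting) {
			c->handoffs++;   /* stop and let the waiter continue */
			break;
		}
		c->pos++;                /* "emit" one character */
		printed++;
	}
	return printed;
}

/*
 * Drain the buffer with printers that always have a waiter behind them,
 * as with the two kthreads passing printing between each other.
 * Returns the longest single turn.
 */
size_t sim_drain(struct sim_console *c)
{
	size_t longest = 0;

	while (c->pos < c->len) {
		size_t n = sim_console_turn(c, 1);

		if (n > longest)
			longest = n;
	}
	return longest;
}
```

With a waiter always available, no single turn exceeds PRINTK_OFFLOAD + 1
characters regardless of how much is queued, whereas a turn with
helper_waiting == 0 prints the whole backlog, which corresponds to the
unbounded case described above.
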
>
> Changes since the last posting [3]:
> * I have replaced the state machine to pass printing and spinning on
>   console_sem with a simple spinlock which makes the code
>   somewhat easier to read and verify.
> * Some of the patches were merged so I dropped them.
>
> Honza

Thanks Jan. I would like to add that the problem described here is further
aggravated in virtual machines and the solution proposed here effectively
solves the problem.

Regards,

K. Y

> [1] https://lkml.org/lkml/2015/7/8/215
> [2] http://marc.info/?l=linux-kernel&m=143929238407816&w=2
> [3] https://lkml.org/lkml/2014/3/17/68