Date: Wed, 28 Nov 2012 09:55:15 -0500
From: Dave Jones
To: linux-mm@kvack.org
Cc: Linux Kernel
Subject: livelock in __writeback_inodes_wb ?
Message-ID: <20121128145515.GA26564@redhat.com>

We had a user report that the soft lockup detector kicked in after
22 seconds of no progress, with this trace:

:BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137]
:Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1
:RIP: 0010:[] [] __list_del_entry+0x2c/0xd0
:Call Trace:
: [] redirty_tail+0x5e/0x80
: [] __writeback_inodes_wb+0x72/0xd0
: [] wb_writeback+0x23b/0x2d0
: [] wb_do_writeback+0xac/0x1f0
: [] ? __internal_add_timer+0x130/0x130
: [] bdi_writeback_thread+0x8b/0x230
: [] ? wb_do_writeback+0x1f0/0x1f0
: [] kthread+0x93/0xa0
: [] kernel_thread_helper+0x4/0x10
: [] ? kthread_freezable_should_stop+0x70/0x70
: [] ? gs_change+0x13/0x13

Looking over the code, is it possible that something could be dirtying
pages faster than writeback can get them written out, keeping us in
this loop indefinitely?

Should there be something in this loop periodically poking the
watchdog, perhaps?

	Dave
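
To make the question concrete, here is a toy user-space model of the scenario being asked about. It is only a sketch and not the kernel code: the function name, the `redirty_every` knob, and the `budget` cutoff are all hypothetical, and the requeue step is merely analogous to the redirty_tail() call seen in the trace.

```python
from collections import deque


def writeback_pass(dirty, redirty_every, budget):
    """Toy model of a writeback loop (hypothetical, not kernel code).

    `dirty` is a deque of inode ids waiting to be written. On every
    `redirty_every`-th iteration the inode is assumed to have been
    dirtied again, so it is requeued at the tail instead of being
    written. `redirty_every=0` means nothing is ever redirtied.
    The loop stops when the queue is empty or after `budget`
    iterations, whichever comes first.

    Returns (iterations, written, drained).
    """
    iterations = 0
    written = 0
    while dirty and iterations < budget:
        inode = dirty.popleft()
        iterations += 1
        if redirty_every and iterations % redirty_every == 0:
            # Requeue at the tail; the queue does not shrink.
            dirty.append(inode)
        else:
            written += 1
    return iterations, written, not dirty
```

With `redirty_every=0` the queue drains in one iteration per inode. With `redirty_every=1` every inode is requeued, the queue never shrinks, and only the `budget` cutoff ends the loop, which is the kind of unbounded spinning the question above is about.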