From: "Zhang, Tianfei"
To: David Rientjes
CC: "gregkh@linuxfoundation.org", "mhocko@suse.com", "arve@android.com",
    "anton.vorontsov@linaro.org", "kirill.shutemov@linux.intel.com",
    "riandrews@android.com", "devel@driverdev.osuosl.org", "Wu, Fengguang",
    "linux-kernel@vger.kernel.org"
Subject: RE: [PATCH RESEND v2 1/1] fix a dead loop when in heavy low memory
Date: Tue, 29 Dec 2015 04:58:11 +0000
References: <1451207691-10313-1-git-send-email-tianfei.zhang@intel.com>

> However, it appears that the same process, dTi-lm, is still chosen for oom kill
> because lowmem_deathpending_timeout has expired.
>
> So this looks like a problem if the constantly chosen process cannot exit.
> It would have been helpful to have the stack of pid 27289 in the log to see
> where it was stuck. But I think it may be unrelated to
> lowmem_deathpending_timeout itself.
> We'd be better off selecting a different process to kill with something
> like this:
>
> diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
> --- a/drivers/staging/android/lowmemorykiller.c
> +++ b/drivers/staging/android/lowmemorykiller.c
> @@ -128,11 +128,15 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
>  		if (!p)
>  			continue;
>  
> -		if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
> -		    time_before_eq(jiffies, lowmem_deathpending_timeout)) {
> -			task_unlock(p);
> -			rcu_read_unlock();
> -			return 0;
> +		if (test_tsk_thread_flag(p, TIF_MEMDIE)) {
> +			if (time_before_eq(jiffies,
> +					   lowmem_deathpending_timeout)) {
> +				task_unlock(p);
> +				rcu_read_unlock();
> +				return 0;
> +			}
> +			/* Need to select a different process to kill */
> +			continue;
>  		}
>  		oom_score_adj = p->signal->oom_score_adj;
>  		if (oom_score_adj < min_score_adj) {
>
> But we need more information. Please make sure that
> lowmem_debug_level is 1, try to get a complete kernel log, and if possible
> please try to capture the stack of the process that can't exit (use
> /proc/<pid>/stack) before trying the above patch.

Hi Rientjes,

I re-ran the monkey stress test with your patch applied, and it behaves
better than the current mainline code.

The kernel log is fairly large (more than 10 MB), so I am sending it to
you directly.

Best
tianfei
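
As a reading aid, the control flow of the patch quoted above can be
sketched in user-space C. This is only an illustration under stated
assumptions, not the kernel code: struct fake_task, select_victim() and
the timeout_pending flag are hypothetical stand-ins for task_struct, the
loop in lowmem_scan(), and the
time_before_eq(jiffies, lowmem_deathpending_timeout) test. Locking and
RCU are left out entirely, and victim choice is simplified to "highest
oom_score_adj wins".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task_struct fields the check reads. */
struct fake_task {
	const char *comm;
	bool memdie;		/* models test_tsk_thread_flag(p, TIF_MEMDIE) */
	int oom_score_adj;
};

/*
 * Returns the index of the task to kill, or -1 if nothing should be
 * killed on this pass (either an earlier victim is still inside its
 * grace period, or no task qualifies).
 */
int select_victim(const struct fake_task *tasks, size_t n,
		  bool timeout_pending, int min_score_adj)
{
	int selected = -1;

	for (size_t i = 0; i < n; i++) {
		const struct fake_task *p = &tasks[i];

		if (p->memdie) {
			/* Earlier victim still has time to exit: back off. */
			if (timeout_pending)
				return -1;
			/* Timeout expired: skip it and look for another victim. */
			continue;
		}
		if (p->oom_score_adj < min_score_adj)
			continue;
		/* Simplification: keep the first task with the highest score. */
		if (selected >= 0 &&
		    p->oom_score_adj <= tasks[selected].oom_score_adj)
			continue;
		selected = (int)i;
	}
	return selected;
}
```

With a task list in which dTi-lm already carries the flag, the sketch
returns -1 while the timeout is pending and moves on to the next
eligible task once it has expired, instead of picking dTi-lm again.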