From: Stefan Priebe
Subject: Re: Call trace in ext4_es_lru_add on 3.10 stable
Date: Tue, 23 Sep 2014 14:23:29 +0200
Message-ID: <54216641.8090608@profihost.ag>
References: <541AD93A.70203@profihost.ag> <20140918192131.GD19520@thunk.org> <541B32A1.3080706@profihost.ag> <20140918194311.GE19520@thunk.org> <541FC817.7030401@profihost.ag> <20140922164715.GB4572@thunk.org> <54206AA2.1050607@profihost.ag> <20140922202004.GF4572@thunk.org> <54212641.9010808@profihost.ag> <20140923094204.GB2359@quack.suse.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Theodore Ts'o, linux-ext4@vger.kernel.org, "p.herz@profihost.ag >> Philipp Herz - Profihost AG", stable@vger.kernel.org
To: Jan Kara
Return-path:
In-Reply-To: <20140923094204.GB2359@quack.suse.cz>
Sender: stable-owner@vger.kernel.org
List-Id: linux-ext4.vger.kernel.org

On 23.09.2014 11:42, Jan Kara wrote:
> On Tue 23-09-14 09:50:25, Stefan Priebe - Profihost AG wrote:
>>
>> On 22.09.2014 at 22:20, Theodore Ts'o wrote:
>>> On Mon, Sep 22, 2014 at 08:29:54PM +0200, Stefan Priebe wrote:
>>>> Hi,
>>>> On 22.09.2014 18:47, Theodore Ts'o wrote:
>>>>> On Mon, Sep 22, 2014 at 08:56:23AM +0200, Stefan Priebe wrote:
>>>>>>> That's not the whole message; you just weren't able to capture it all.
>>>>>>> How are you capturing these messages, by the way? Serial console?
>>>>>>
>>>>>> Sorry this was an incomplete copy and paste by me.
>>>>>>
>>>>>> Here is the complete output:
>>>>>> [1578544.839610] BUG: soft lockup - CPU#7 stuck for 22s! [mysqld:29281]
>>>>>> [1578544.893450] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4
>>>>>
>>>>> OK, thanks, this is a known bug, where when ext4 is under heavy memory
>>>>> pressure, we can end up stalling in reclaim. This message indicates
>>>>> that the system got stalled for 22 seconds, which is not good, since
>>>>> it impacts the interactivity of your system, and increases the
>>>>> long-tail latency of requests to servers running on your system, but
>>>>> it doesn't cause any data loss or cause any of your processes to
>>>>> crash or otherwise stop functioning (except temporarily).
>>>>>
>>>>> It's something that we are working on, and there are patches which
>>>>> Zheng Liu submitted that still need a bit of polishing, but I hope to
>>>>> have it addressed soon.
>>>>
>>>> Thanks for your feedback. Will those patches go to stable? Any link to
>>>> those patches?
>>>
>>> I'm not sure they will go to Stable when they are ready, because the
>>> patches are somewhat complex and so they may not apply cleanly to much
>>> older kernels.
>>>
>>> The patches under discussion (some have been applied, others have been
>>> waiting for some requested changes) can be found here:
>>>
>>> http://patchwork.ozlabs.org/patch/377720
>>> http://patchwork.ozlabs.org/patch/377721
>>> http://patchwork.ozlabs.org/patch/377722
>>> http://patchwork.ozlabs.org/patch/377723
>>> http://patchwork.ozlabs.org/patch/377724
>>> http://patchwork.ozlabs.org/patch/377725
>>> http://patchwork.ozlabs.org/patch/377727
>>
>> Whoa, that's a lot. Are they ALL needed to fix this?
> Yes, all of them are needed.

How can I get notified when they're ready / polished?

Stefan

>> No workaround possible?
> I don't know about any.
>
>> What will Red Hat do with their 3.10 RHEL 7 kernel?
> Well, I cannot speak for RH guys but for SLES if there's a customer
> request, we'll just go and backport the patches...
> Honza
>