Date: Mon, 10 Oct 2016 09:03:15 -0500 (CDT)
From: Christoph Lameter
To: Linus Torvalds
cc: Al Viro, Andrew Morton, Jens Axboe, "Ted Ts'o", Linux Kernel Mailing List, linux-fsdevel
Subject: Re: [git pull] vfs pile 1 (splice)
References: <20161007222059.GS19539@ZenIV.linux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 9 Oct 2016, Linus Torvalds wrote:

> Hmm. When I enabled SLUB debugging, I also enabled DEBUG_PAGEALLOC,
> because "why not". But it turns out that that may have been a mistake,
> because it changes the very path that failed to no longer do that
> failing access (or rather, it does it as a "probe_kernel_read()",
> which traps and ignores the failure).

DEBUG_PAGEALLOC significantly changes the layout of objects and thus this
may no longer trigger.

> I'll continue with *just* SLUB debugging on, but I thought it was
> interesting how enabling more memory access debugging actually ends up
> changing some really subtle code.

Debugging options to memory allocation functions can change the memory
layout which may cause the corruption to no longer happen or no longer
happen the same way. Surely wish there would be another way.

> Christoph, the problem is that something is triggering an oops or page
> fault (depending on how bogus the address is) in __kmalloc() when it
> does that get_freepointer_safe() thing without DEBUG_PAGEALLOC. I've
> seen two different cases on two different boots, but they both were on
> that one instruction that did that

Hmm.. Then get_freepointer_safe may not be ok. Should not trigger any
faults.

> Could be elsewhere too. I saw it twice in one day which would *tend*
> to mean that it's recent, but maybe I was just lucky the previous days
> and didn't hit it. I haven't been able to repro it now, but maybe I
> figured out one reason why my reproductions have been failing ;)

Ok reading the rest of the thread it seems that we found the issue but
still this get_freepointer_safe failure is not good. Do you have some more
debugging output that can shed some more light on the failure of
get_freepointer_safe?
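
To illustrate the layout point made above, here is a rough userspace sketch,
not kernel code. Every name in it (toy_cache, toy_cache_layout,
toy_freepointer_survives_uaf, TOY_SLOTS) is hypothetical. It assumes a
simplified model: without debugging, the free pointer overlays the first word
of a free object; with debugging, a red zone is put in front of the payload
and the free pointer is moved behind it. The same stray write then corrupts
the freelist in one layout and misses it in the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SLOTS 4          /* hypothetical: objects per toy slab */
#define TOY_MAX_SLOT 64      /* hypothetical: upper bound on slot size */

/* Hypothetical toy cache descriptor; NOT the kernel's struct kmem_cache. */
struct toy_cache {
    size_t red_zone;   /* debug padding placed before the payload */
    size_t fp_offset;  /* where the free pointer lives inside a slot */
    size_t slot_size;  /* total bytes per slot */
};

/* Assumed model: debugging adds a red zone and moves the free pointer
 * out of the payload, so the payload can be poisoned and checked. */
static struct toy_cache toy_cache_layout(size_t object_size, int debug)
{
    struct toy_cache c;

    c.red_zone = debug ? sizeof(void *) : 0;
    c.fp_offset = debug ? c.red_zone + object_size : 0;
    c.slot_size = c.red_zone + object_size + (debug ? sizeof(void *) : 0);
    return c;
}

/* Read the free pointer stored in slot i (memcpy avoids alignment traps). */
static void *toy_get_freepointer(const struct toy_cache *c,
                                 const unsigned char *slab, size_t i)
{
    void *p;

    memcpy(&p, slab + i * c->slot_size + c->fp_offset, sizeof(p));
    return p;
}

/* Thread all slots onto a freelist: slot i points at slot i + 1. */
static void toy_init_freelist(const struct toy_cache *c, unsigned char *slab)
{
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        void *next = NULL;

        if (i + 1 < TOY_SLOTS)
            next = slab + (i + 1) * c->slot_size;
        memcpy(slab + i * c->slot_size + c->fp_offset, &next, sizeof(next));
    }
}

/* Simulate a use-after-free write to the first payload word of slot 0.
 * Returns 1 if slot 0's free pointer is still intact, 0 if corrupted. */
static int toy_freepointer_survives_uaf(size_t object_size, int debug)
{
    unsigned char slab[TOY_SLOTS * TOY_MAX_SLOT];
    struct toy_cache c = toy_cache_layout(object_size, debug);
    uintptr_t garbage = (uintptr_t)0xdead;
    void *expected;

    assert(object_size >= sizeof(garbage));
    assert(c.slot_size <= TOY_MAX_SLOT);

    memset(slab, 0, sizeof(slab));
    toy_init_freelist(&c, slab);
    expected = slab + c.slot_size;                 /* address of slot 1 */

    memcpy(slab + c.red_zone, &garbage, sizeof(garbage)); /* the stray write */
    return toy_get_freepointer(&c, slab, 0) == expected;
}
```

Calling toy_freepointer_survives_uaf(32, 0) models the non-debug layout, where
the stray write lands on the free pointer; toy_freepointer_survives_uaf(32, 1)
models the debug layout, where the same write leaves the freelist untouched.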