Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751297AbVJJW2u (ORCPT ); Mon, 10 Oct 2005 18:28:50 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751295AbVJJW2t (ORCPT ); Mon, 10 Oct 2005 18:28:49 -0400 Received: from ppsw-1.csi.cam.ac.uk ([131.111.8.131]:62872 "EHLO ppsw-1.csi.cam.ac.uk") by vger.kernel.org with ESMTP id S1751293AbVJJW2r (ORCPT ); Mon, 10 Oct 2005 18:28:47 -0400 X-Cam-SpamDetails: Not scanned X-Cam-AntiVirus: No virus found X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/ Date: Mon, 10 Oct 2005 23:28:42 +0100 (BST) From: Anton Altaparmakov To: Glauber de Oliveira Costa cc: Mikulas Patocka , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, ext2-devel@lists.sourceforge.net, hirofumi@mail.parknet.co.jp, linux-ntfs-dev@lists.sourceforge.net, aia21@cantab.net, hch@infradead.org, viro@zeniv.linux.org.uk, akpm@osdl.org Subject: Re: [PATCH] Use of getblk differs between locations In-Reply-To: <20051010223636.GB11427@br.ibm.com> Message-ID: References: <20051010204517.GA30867@br.ibm.com> <20051010214605.GA11427@br.ibm.com> <20051010223636.GB11427@br.ibm.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2846 Lines: 65 On Mon, 10 Oct 2005, Glauber de Oliveira Costa wrote: > > >>If you had read the source code rather than just the comments you would > > >>have seen that this is not true. It can return NULL (see > > >>fs/buffer.c::__getblk_slow()). Certainly I would prefer to keep the > > >>checks in NTFS, please. They may only be good for catching bugs but I > > >>like catching bugs rather than segfaulting due to a NULL dereference. > > > > The check should be rather a BUG() than dump_stack() and return NULL --- I > > think it's not right to write code to recover from programming errors. > > Filesystem drivers are supposed to pass correct blocksize to getblk(). --- > > even for users it's better to crash, because user whose machine has locked > > up on BUG() will report bug more likely than user whose machine has > > written stack dump into log and corrupted filesystem --- by the time he > > discovers the corruption and mesage he might not even remember what > > triggered it. > > That was what I meant by having the opposite problem here. I think > dumping the stack and returning NULL is okay, as long as all programmers > test its return value, and decide to fail in an alternative way, just > like Anton does, for example. But unfortunately, that's not what happen. > > In a lot of cases, we see uses like these: (This one from affs.h) > > bh = sb_getblk(sb, block); > lock_buffer(bh); > memset(bh->b_data, 0 , sb->s_blocksize); > set_buffer_uptodate(bh); > unlock_buffer(bh); > > Which does not seem to be the right usage for it. > > As I said, I took away the checks because I missed that return > statement. I usually don't think that hanging is the preferred solution > in the cases in which you can stop gracefully - But in case you do stop > gracefully, not dereference a NULL pointer. > > > > > > As comment in buffer.c says, getblk will deadlock if the machine is out of > > memory. It is questionable whether to deadlock or return NULL and corrupt > > filesystem in this case --- deadlock is probably better. > > > > Mikulas > > Maybe the best solution is neither one nor another. Testing and failing > gracefully seems better. > > What do you think? I certainly agree with you there. I neither want a deadlock nor corruption. (-: Best regards, Anton -- Anton Altaparmakov (replace at with @) Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/