Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752144Ab2HMLXI (ORCPT ); Mon, 13 Aug 2012 07:23:08 -0400 Received: from charlotte.tuxdriver.com ([70.61.120.58]:36143 "EHLO smtp.tuxdriver.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751507Ab2HMLXG (ORCPT ); Mon, 13 Aug 2012 07:23:06 -0400 Date: Mon, 13 Aug 2012 07:22:53 -0400 From: Neil Horman To: John Fastabend Cc: Al Viro , netdev@vger.kernel.org, David Miller , linux-kernel@vger.kernel.org Subject: Re: [flame^Wreview] net: netprio_cgroup: rework update socket logic Message-ID: <20120813112253.GA7300@hmsreliant.think-freely.org> References: <20120813015348.GZ23464@ZenIV.linux.org.uk> <502896C5.7080303@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <502896C5.7080303@intel.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -2.9 (--) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3715 Lines: 80 On Sun, Aug 12, 2012 at 10:55:17PM -0700, John Fastabend wrote: > On 8/12/2012 6:53 PM, Al Viro wrote: > > Ladies and gentlemen, who the devil had reviewed that little gem? > > > >commit 406a3c638ce8b17d9704052c07955490f732c2b8 > >Author: John Fastabend > >Date: Fri Jul 20 10:39:25 2012 +0000 > > > >is a bleeding bogosity that doesn't pass even the most cursory > >inspection. It iterates through descriptor tables of a bunch > >of processes, doing this: > > file = fcheck_files(files, fd); > > if (!file) > > continue; > > > > path = d_path(&file->f_path, tmp, PAGE_SIZE); > > rv = sscanf(path, "socket:[%lu]", &s); > > if (rv <= 0) > > continue; > > > > sock = sock_from_file(file, &err); > > if (!err) > > sock_update_netprioidx(sock->sk, p); > >Note the charming use of sscanf() for pattern-matching. 's' (inode > >number of socket) is completely unused afterwards; what happens here > >is a very badly written attempt to skip non-sockets. Why, will > >sock_from_file() blow up on non-sockets? And isn't there some less > >obnoxious way to check that the file is a sockfs one? Let's see: > >struct socket *sock_from_file(struct file *file, int *err) > >{ > > if (file->f_op == &socket_file_ops) > > return file->private_data; /* set in sock_map_fd */ > > > > *err = -ENOTSOCK; > > return NULL; > >} > >... and the first line is exactly that - a check that we are on sockfs. > >_Far_ less expensive one, at that, so it's not even that we are avoiding > >a costly test. In other words, all masturbation with d_path() is absolutely > >pointless. > > > >Furthermore, it's racy; had been even more so before the delayed fput series > >went in, but even now it's not safe. fcheck_files() does *NOT* guarantee > >that file is not getting closed right now. rcu_read_lock() prevents only > >freeing and potential reuse of struct file we'd got; it might go all the > >way through final fput() just as we look at it. So file->f_path is not > >protected by anything. Worse yet, neither is struct socket itself - we > >might be going through sock_release() at the same time, so sock->sk might > >very well be NULL, leaving us a oops even after we dump d_path() idiocy. > > > >To make it even funnier, there's such thing as SCM_RIGHTS datagrams and > >descriptor passing. In other words, it's *not* going to catch all sockets > >that would be caught by the earlier variant. > > > Yes, thank you Al, I should have reviewed this more throughly. > OK clearly I screwed it up thanks for reviewing Al. How about this. > > fdt = files_fdtable(files); > for (fd = 0; fd < fdt->max_fds; fd++) { > struct socket *sock; > int err = 0; > > sock = sockfd_lookup(fd, &err); > if (!sock) { > lock_sock(sock->sk); What are you protecting with lock_sock here? The other call sites don't make use of the socket lock. Arguagbly they could, but I don't think they need to. As long as the fd doesn't go away we should be able to update the sk_cgrp_prioidx safely. Regards Neil -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/