Subject: Re: About the try to remove cross-release feature entirely by Ingo
To: Theodore Ts'o, Matthew Wilcox, Byungchul Park, Thomas Gleixner,
    Peter Zijlstra, Ingo Molnar, david@fromorbit.com, Linus Torvalds,
    Amir Goldstein, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    oleg@redhat.com, kernel-team@lge.com, daniel@ffwll.ch
References: <20171229014736.GA10341@X58A-UD3R>
 <20171229035146.GA11757@thunk.org>
 <20171229072851.GA12235@X58A-UD3R>
 <20171230061624.GA27959@bombadil.infradead.org>
 <20171230154041.GB3366@thunk.org>
 <20171230204417.GF27959@bombadil.infradead.org>
 <20171230224028.GC3366@thunk.org>
 <20171230230057.GB12995@thunk.org>
 <20180101101855.GA23567@bombadil.infradead.org>
 <20180101160011.GA27417@thunk.org>
From: Byungchul Park <byungchul.park@lge.com>
Message-ID: <6ee9ebc1-dc55-99d4-fa1f-ee9eb6084916@lge.com>
Date: Wed, 3 Jan 2018 11:38:33 +0900
In-Reply-To: <20180101160011.GA27417@thunk.org>

On 1/2/2018 1:00 AM, Theodore Ts'o wrote:
> On Mon, Jan 01, 2018 at 02:18:55AM -0800, Matthew Wilcox wrote:
>>> Clarification: all TCP connections that are used by kernel code would
>>> need to be in their own separate lock class.  All TCP connections used
>>> only by userspace could be in their own shared lock class.  You can't
>>> use one lock class for all kernel-used TCP connections, because of
>>> the Network Block Device mounted on a local file system which is then
>>> exported via NFS and squirted out yet another TCP connection problem.
>>
>> So the false positive you're concerned about is write-comes-in-over-NFS
>> (with socket lock held), NFS sends a write request to local filesystem,
>> local filesystem sends write to block device, block device sends a
>> packet to a socket which takes that socket lock.
>
> It's not just the socket lock, but any of the locks/mutexes/"waiters"
> that might be taken in the TCP code path and below, including in the
> NIC driver.
>
>> I don't think we need to be as drastic as giving each socket its own
>> lock class to solve this.  All NFS sockets can be in lock class A; all
>> NBD sockets can be in lock class B; all user sockets can be in lock
>> class C; etc.
>
> But how do you know which of the locks taken in the networking stack
> are for the NBD versus the NFS sockets?  What manner of horrific
> abstraction violation is going to pass that information all the way
> down to all of the locks that might be taken at the socket layer and
> below?
>
> How is this "proper classification" supposed to happen?  It's the
> repeated handwaving which claims this is easy which is rather
> frustrating.
> The simple thing is to use a unique ID which is bumped for each
> struct sock, each struct super, struct block_device, struct
> request_queue, struct bdi, etc, but that runs into lockdep
> scalability issues.

This is what I mentioned with the group ID in the example I gave you
before. To make that work, the most important thing is to keep lockdep
from running into those scalability limits (a rough sketch of one way
to bound the number of classes follows below).

-- 
Thanks,
Byungchul
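
For what it's worth, here is a rough sketch of how a bounded "group ID"
could be mapped onto lockdep classes for kernel-owned sockets, in the
same style as the reclassification net/sunrpc/xprtsock.c already does
via sock_lock_init_class_and_name() in xs_reclassify_socket*().
Everything named sock_group_* and SOCK_LOCK_GROUPS below is invented
for illustration, not an existing interface, and the policy for handing
out IDs is only a placeholder.

/*
 * Illustrative sketch only: give kernel-owned sockets a small "group
 * ID" and map it onto a fixed array of lockdep keys, instead of either
 * one shared class for all kernel sockets or one class per socket
 * (which would blow up lockdep's bounded class table).
 */
#include <net/sock.h>
#include <linux/atomic.h>
#include <linux/lockdep.h>

#define SOCK_LOCK_GROUPS	8	/* bounds the number of lock classes */

static struct lock_class_key sock_group_key[SOCK_LOCK_GROUPS];
static struct lock_class_key sock_group_slock_key[SOCK_LOCK_GROUPS];
static atomic_t sock_group_next = ATOMIC_INIT(0);

/*
 * A kernel user (NFS transport, NBD connection, ...) would call this
 * once on each socket it owns, the way sunrpc reclassifies its
 * transport sockets today.
 */
static void sock_assign_lock_group(struct socket *sock)
{
	struct sock *sk = sock->sk;
	unsigned int id = atomic_inc_return(&sock_group_next) % SOCK_LOCK_GROUPS;

	/*
	 * Pull this socket's locks out of the generic per-family class
	 * so that dependencies recorded on one kernel user's sockets
	 * are not reported against another's.
	 */
	sock_lock_init_class_and_name(sk,
			"slock-AF_INET-kern-group", &sock_group_slock_key[id],
			"sk_lock-AF_INET-kern-group", &sock_group_key[id]);
}

Capping the key array keeps the total number of lockdep classes bounded,
which is the scalability concern above; the trade-off is that two
unrelated kernel users can land in the same group and either share a
false dependency or hide a real one.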