Subject: Re: [PATCH 0/3] Provide more fine grained control over multipathing
To: Mike Snitzer, Christoph Hellwig
Cc: Johannes Thumshirn, Keith Busch, Hannes Reinecke, Laurence Oberman,
    Ewan Milne, James Smart, Linux Kernel Mailinglist,
    Linux NVMe Mailinglist, "Martin K . Petersen", Martin George,
    John Meneghini
References: <20180525125322.15398-1-jthumshirn@suse.de>
    <20180525130535.GA24239@lst.de> <20180525135813.GB9591@redhat.com>
In-Reply-To: <20180525135813.GB9591@redhat.com>
From: Sagi Grimberg
Date: Thu, 31 May 2018 00:20:03 +0300
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Folks,

I'm sorry to chime in super late on this, but a lot has been
going on for me lately which got me off the grid.

So I'll try to provide my input hopefully without starting any more
flames..

>>> This patch series aims to provide a more fine grained control over
>>> nvme's native multipathing, by allowing it to be switched on and off
>>> on a per-subsystem basis instead of a big global switch.
>>
>> No. The only reason we even allowed to turn multipathing off is
>> because you complained about installer issues. The path forward
>> clearly is native multipathing and there will be no additional support
>> for the use cases of not using it.
>
> We all basically knew this would be your position. But at this year's
> LSF we pretty quickly reached consensus that we do in fact need this.
> Except for yourself, Sagi and afaik Martin George: all on the cc were in
> attendance and agreed.

Correction, I wasn't able to attend LSF this year (unfortunately).

> And since then we've exchanged mails to refine and test Johannes'
> implementation.
>
> You've isolated yourself on this issue. Please just accept that we all
> have a pretty solid command of what is needed to properly provide
> commercial support for NVMe multipath.
>
> The ability to switch between "native" and "other" multipath absolutely
> does _not_ imply anything about the winning disposition of native vs
> other. It is purely about providing commercial flexibility to use
> whatever solution makes sense for a given environment. The default _is_
> native NVMe multipath. It is on userspace solutions for "other"
> multipath (e.g. multipathd) to allow user's to whitelist an NVMe
> subsystem to be switched to "other".
>
> Hopefully this clarifies things, thanks.

Mike, I understand what you're saying, but I also agree with hch on
the simple fact that this is a burden on linux nvme (although I'm less
passionate about it than hch).

Beyond that, this is going to get much worse when we support "dispersed
namespaces", which is a submitted TPAR in the NVMe TWG. "Dispersed
namespaces" makes NVMe namespaces shareable across different subsystems,
so changing the personality on a per-subsystem basis is just asking for
trouble.

Moreover, I also wanted to point out that fabrics array vendors are
building products that rely on standard nvme multipathing (and probably
multipathing over dispersed namespaces as well), and keeping a knob that
will keep nvme users on dm-multipath will probably not help them
educate their customers either... So there is another angle to this.

Don't get me wrong, I do support your cause, and I think nvme should try
to help. I just think that subsystem granularity is not the correct
approach going forward.

As I said, I've been off the grid, so can you remind me why a global
knob is not sufficient?

This might sound stupid to you, but can't users that desperately must
keep using dm-multipath (for its mature toolset or what-not) just
stack it on top of the multipath nvme device? (I might be completely
off on this, so feel free to correct my ignorance.)