Subject: Re: [RFC PATCH 00/17] btrfs zoned block device support
To: "Austin S. Hemmelgarn", dsterba@suse.cz, Naohiro Aota, David Sterba, linux-btrfs@vger.kernel.org, Chris Mason, Josef Bacik, linux-kernel@vger.kernel.org, Damien Le Moal, Bart Van Assche, Matias Bjorling
References: <20180809180450.5091-1-naota@elisp.net> <20180813184251.GC24025@twin.jikos.cz> <86bddb14-104e-182b-29a1-6ab8150f09a8@suse.com> <057b6600-0fef-4067-54ca-216b591d43f8@gmail.com>
From: Hannes Reinecke
Message-ID: <9531d57f-2271-7eb8-b734-dac6d33f0ec1@suse.com>
Date: Tue, 14 Aug 2018 09:41:36 +0200
In-Reply-To: <057b6600-0fef-4067-54ca-216b591d43f8@gmail.com>

On 08/13/2018 09:29 PM, Austin S. Hemmelgarn wrote:
> On 2018-08-13 15:20, Hannes Reinecke wrote:
>> On 08/13/2018 08:42 PM, David Sterba wrote:
>>> On Fri, Aug 10, 2018 at 03:04:33AM +0900, Naohiro Aota wrote:
>>>> This series adds zoned block device support to btrfs.
>>>
>>> Yay, thanks!
>>>
[ .. ]
>>> Device replace is disabled, but the changelog suggests there's a way to
>>> make it work, so it's a matter of implementation. And this should be
>>> implemented at the time of merge.
>>>
>> How would a device replace work in general?
>> While I do understand that device replace is possible with RAID
>> thingies, I somewhat fail to see how one could do a device replacement
>> without RAID functionality.
>> Is it even possible?
>> If so, how would it be different from a simple umount?
> Device replace is implemented in largely the same manner as most other
> live data migration tools (for example, LVM2's pvmove command).
>
> In short, when you issue a replace command for a given device, all
> writes that would go to that device are instead sent to the new device.
> While this is happening, old data is copied over from the old device to
> the new one. Once all the data is copied, the old device is released
> (and its BTRFS signature wiped), and the new device has its device ID
> updated to that of the old device.
>
> This is possible largely because of the COW infrastructure, but it's
> implemented in a way that doesn't entirely depend on it (otherwise it
> wouldn't work for NOCOW files).
>
> Handling this on zoned devices is not likely to be easy though; you
> would functionally have to freeze I/O that would hit the device being
> replaced so that you don't accidentally write to a sequential zone out
> of order.

Ah. Oh. Hmm.

It would be possible in principle if we freeze accesses to any partially
filled zones on the original device. Then all new writes will be going
into new/empty zones on the new disk, and we can copy over the old data
with no issue at all.
We end up with some partially filled zones on the new disk, but they
should be cleaned up eventually, either by the allocator filling up
the partially filled zones or by garbage collection clearing out stale
zones.

However, I fear the required changes to the btrfs allocator are beyond
my btrfs knowledge :-(

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                zSeries & Storage
hare@suse.com                      +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
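
Illustrative sketch (not btrfs code): the scheme discussed in this message
can be modelled as a toy simulation. Everything here is an assumption made
for illustration: the Zone class, the replace_device() function, equal zone
counts and sizes on both devices, and copying each old zone to the same
zone index on the new device. It only shows why freezing the partially
filled zones keeps every zone strictly sequential while new writes and the
background copy are interleaved.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """Hypothetical sequential-write zone: blocks may only be appended."""
    capacity: int
    blocks: list = field(default_factory=list)
    frozen: bool = False

    @property
    def write_pointer(self) -> int:
        return len(self.blocks)

    def append(self, block: str) -> None:
        if self.frozen:
            raise RuntimeError("zone is frozen")
        if self.write_pointer >= self.capacity:
            raise RuntimeError("zone is full")
        self.blocks.append(block)


def replace_device(old: list, new: list, incoming: list) -> None:
    """Toy model of the proposal; assumes old and new have matching zones."""
    # Step 1: freeze partially filled zones on the old device.
    for zone in old:
        if 0 < zone.write_pointer < zone.capacity:
            zone.frozen = True

    # Zones that are empty on the old device take the new writes,
    # at the same index on the new device.
    fresh = [new[i] for i, zone in enumerate(old) if zone.write_pointer == 0]
    pending = list(incoming)

    def write_new() -> None:
        # Step 2: redirect one new write into an empty zone on the new device.
        if not pending:
            return
        for zone in fresh:
            if zone.write_pointer < zone.capacity:
                zone.append(pending.pop(0))
                return
        raise RuntimeError("no free zone left for new writes")

    # Step 3: copy old data zone by zone, interleaved with new writes.
    # Copy targets and write targets never overlap, so each zone on the
    # new device is only ever appended to in order.
    for src, dst in zip(old, new):
        for block in src.blocks:
            dst.append(block)
        write_new()

    # Drain any writes still queued after the copy has finished.
    while pending:
        write_new()


old = [Zone(4, ["a", "b", "c", "d"]), Zone(4, ["e", "f"]), Zone(4), Zone(4)]
new = [Zone(4) for _ in old]
replace_device(old, new, ["x", "y", "z"])
for index, zone in enumerate(new):
    print(index, zone.blocks)
```

In this model the new device ends up with one partially filled copied zone
and one partially filled zone holding the new writes, matching the
"partially filled zones on the new disk" outcome described above. What the
real allocator would have to do to fill or reclaim them is left open here.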