Message-ID: <554D0502.1080808@fb.com>
Date: Fri, 8 May 2015 12:48:34 -0600
From: Jens Axboe
To: "Martin K. Petersen"
CC: Jeff Moyer
Subject: Re: [PATCH v2] Support for write stream IDs
References: <1430856181-19568-1-git-send-email-axboe@fb.com> <55493097.6040007@fb.com> <55493AC1.9090408@fb.com> <554A4DB7.7070202@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/07/2015 01:19 PM, Martin K. Petersen wrote:
>>>>>> "Jens" == Jens Axboe writes:
>
> Jens> This won't solve the problem of devices having too few streams. But
> Jens> it'll work regardless, we'll just have to push them separately to
> Jens> do that. It's not an easy problem for them either, resource
> Jens> constraints on the device side could exclude supporting as many
> Jens> streams as we would ideally want.
>
> But they already have to manage *every* other resource that way: Read
> cache, write cache, flash channels, open zones on ZAC/ZBC. If they run
> out of memory and have to internally close one stream context to open
> another that's their business.
> If the concurrent ID count is low,
> performance of their particular widgets is going to suck for some
> applications and people will avoid them. Boo hoo.

There are actual technical challenges on the device side that sometimes
interfere. Say you currently implemented your flash device as a
log-based structure. The only way to even support streams is to have
more logs you can append to. So perhaps then you can't support streams,
boo hoo for them.

Maybe you don't have a fixed log, you can write anywhere. But you can
only have so many erase blocks open at one time. Not a huge concern, you
have to manage that and open/close streams as you need to. That's basic
resource management. But even if you do that, erase blocks are not small
these days, even for big devices there are only so many of them (we're
talking thousands in total, not millions). There's a very real, and
fairly low, upper bound there on what can be supported.

It's easy enough being on the device side and punting everything to the
OS. Why wouldn't you? Then it's out of your hair. At the same time, on
the other side, there's also an OS tendency to whine and want
everything, and helpfully all the time.

The reality is that we can't demand that devices support thousands of
streams. It'd be nice if they did and we didn't have to care at all, but
realistically, that is not going to happen nor is it a completely sane
demand. And while that may not be perfect, it's still a worthwhile
improvement and won't preclude a hopefully rosier future where we have
more streams.

Let's say we have 8 streams now. We need some sort of policy to
multiplex those streams. That's the current challenge. I can add the
kernel managed streams, and I'll do that.

> I'm super happy the SSD industry (well, the market) came to its senses
> and abolished all the outrageous demands put on the I/O stack to
> overcome erase block size and write amplification issues. Now all that's
> a solved problem and we can move on.

I would not say write amplification is a "solved issue"; in fact, we're
attempting to improve it with this :-). But I know what you mean, and
yes, that was a sad situation. I don't think it's a fair comparison,
though.

> Next problem child was the host managed zoned disk madness. Yet another
> device implementation headache that suddenly requires us to reinvent
> filesystems and the entire I/O stack.
>
> Next in the pipeline is the stream ID stuff. Which once again puts the
> burden on us to overcome device implementation issues and misunderstands
> how operating systems work.

Again, I don't think that's a fair comparison. Write streams are useful.
And adding support for write streams, even in a limited fashion, can be
directly extended when/if more stream IDs are available. The only change
would be in the management policy. The basic premise of open stream, use
stream, close stream - those would be the same. I get that you don't
like that we need to manually open and close streams, but honestly,
those are useful hints to the device, even if it doesn't need to do
anything about them explicitly.

> There are two fundamental problems:
>
>  - The standards are developed by device vendors with little to no input
>    from the OS vendors
>
>  - The standards proposals are written, edited, and declared complete
>    before anybody actually tries to implement them
>
> That's how we end up with all these lame duck spec extensions that are
> device implementation-specific and impossible to use generically.

The write streams proposal was already approved by T10...

> There are many, many reasons why stream IDs are a good thing. Above and
> beyond what the current proposals want. The notion of tagging is a much
> better abstraction than bootiness and guessing a percentage for how
> sequential future accesses might be. It's a simple, clean interface that
> the device--regardless of media type and implementation--can benefit
> from.

Exactly, which is why streams are the first hinting mechanism that I
actually think can work. The previous stuff has been utter crap, and
I've always ignored it for exactly that reason. If we want/need policy
on top of the streams, we can implement that independently.

-- 
Jens Axboe