From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: Jeff Moyer
Cc: Tejun Heo, axboe@kernel.dk, bcrl@kvack.org, viro@zeniv.linux.org.uk,
    linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-aio@kvack.org, oleg@redhat.com
Subject: Re: [PATCH 0/5] blkcg: Limit maximum number of aio requests available for cgroup
Date: Tue, 5 Dec 2017 02:14:54 +0300
Message-ID: <707ca8fa-aee1-f068-b8ab-de5004d3d7ac@virtuozzo.com>
References: <151240305010.10164.15584502480037205018.stgit@localhost.localdomain>
    <20171204200756.GC2421075@devbig577.frc2.facebook.com>
    <17b22d53-ad3d-1ba8-854f-fc2a43d86c44@virtuozzo.com>
    <20171204215234.GN2421075@devbig577.frc2.facebook.com>
    <6eaa11a6-a087-42ab-df65-9142b59bf726@virtuozzo.com>

On 05.12.2017 01:59, Jeff Moyer wrote:
> Kirill Tkhai writes:
>
>> On 05.12.2017 00:52, Tejun Heo wrote:
>>> Hello, Kirill.
>>>
>>> On Tue, Dec 05, 2017 at 12:44:00AM +0300, Kirill Tkhai wrote:
>>>>> Can you please explain how this is a fundamental resource which can't
>>>>> be controlled otherwise?
>>>>
>>>> Currently, aio_nr and aio_max_nr are global. In the case of containers, this
>>>> means that a single container may occupy all aio requests that are
>>>> available in the system and deprive others of the possibility to use aio
>>>> at all. This may happen because of evil intentions of the container's
>>>> user or because of a program error, when the user does this accidentally.
>>>
>>> Hmm... I see. It feels really wrong to me to make this a first class
>>> resource because there is a system wide limit. The only reason I can
>>> think of for the system wide limit is to prevent too much kernel
>>> memory being consumed by creating a lot of aios, but that squarely falls
>>> inside cgroup memory controller protection. If there are other
>>> reasons why the number of aios should be limited system-wide, please
>>> bring them up.
>>>
>>> If the only reason is kernel memory consumption protection, the only
>>> thing we need to do is make sure that memory used for aio commands
>>> is accounted against cgroup kernel memory consumption and
>>> relax/remove the system wide limit.
>>
>> So, we just use the GFP_KERNEL_ACCOUNT flag for allocation of internal aio
>> structures and pages, and all the memory will be accounted in kmem and
>> limited by memcg. Looks very good.
>>
>> One detail about memory consumption. io_submit() calls the primitives
>> file_operations::write_iter and read_iter. It's not clear to me whether
>> they consume the same memory as if the writev() or readv() system calls
>> were used instead. writev() may delay the actual write until the dirty
>> pages limit is reached, so it seems the logic of the accounting should
>> be the same. So aio mustn't use more unaccounted system memory in file
>> system internals than a plain writev() does.
>>
>> Could you please say whether you have any thoughts about this?
>
> I think you just need to account the completion ring.

A request of type struct aio_kiocb consumes much more memory than
struct io_event does. Shouldn't we account it too?

Kirill
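
To make the two policies discussed in this thread concrete, here is a toy
userspace model, not kernel code. It contrasts one system-wide request
counter (in the spirit of aio_nr / aio_max_nr) with per-container memory
charging (in the spirit of GFP_KERNEL_ACCOUNT allocations limited by memcg).
Every class, function and constant name in it is hypothetical, and
COST_PER_REQUEST is an arbitrary illustrative number, not a measured size of
struct aio_kiocb or struct io_event.

```python
from dataclasses import dataclass

# Arbitrary per-request cost in bytes, chosen only for the sketch.
# It is NOT the real size of any kernel structure.
COST_PER_REQUEST = 256


class GlobalLimit:
    """One system-wide counter shared by every container (toy aio_nr/aio_max_nr)."""

    def __init__(self, max_nr):
        self.max_nr = max_nr
        self.nr = 0

    def reserve(self, nr_events):
        # Refuse the reservation if it would exceed the shared limit.
        if self.nr + nr_events > self.max_nr:
            return False
        self.nr += nr_events
        return True


@dataclass
class Group:
    """A container with its own memory budget (toy stand-in for a memcg limit)."""
    name: str
    limit_bytes: int
    charged_bytes: int = 0

    def charge(self, nbytes):
        # Refuse the charge if it would exceed this group's own budget.
        if self.charged_bytes + nbytes > self.limit_bytes:
            return False
        self.charged_bytes += nbytes
        return True


def demo():
    # Policy A: a greedy container drains the shared pool, the victim gets nothing.
    pool = GlobalLimit(max_nr=1000)
    greedy_got_all = pool.reserve(1000)
    victim_global_ok = pool.reserve(1)

    # Policy B: each container is charged for its own requests only.
    greedy = Group("greedy", limit_bytes=64 * 1024)
    victim = Group("victim", limit_bytes=64 * 1024)
    greedy_requests = 0
    while greedy.charge(COST_PER_REQUEST):
        greedy_requests += 1
    victim_charged_ok = victim.charge(COST_PER_REQUEST)

    return greedy_got_all, victim_global_ok, greedy_requests, victim_charged_ok
```

In this model the greedy container is stopped by its own budget, while the
other container can still allocate. How well that carries over to the kernel
depends on charging everything a request really costs, which is the point of
the question about struct aio_kiocb above.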