Message-ID: <552CCBB1.6000207@mellanox.com>
Date: Tue, 14 Apr 2015 11:11:29 +0300
From: Haggai Eran
To: Yann Droneaud
CC: Shachar Raindel, Sagi Grimberg, "linux-rdma@vger.kernel.org", "linux-kernel@vger.kernel.org"
Subject: Re: CVE-2014-8159 kernel: infiniband: uverbs: unprotected physical memory access
References: <1427969085.17020.5.camel@opteya.com>
 <1427981431.22575.21.camel@opteya.com> <551D5DC8.6070909@mellanox.com>
 <1427992506.22575.80.camel@opteya.com> <1427998401240.52348@mellanox.com>
 <1428931781.22575.232.camel@opteya.com>
In-Reply-To: <1428931781.22575.232.camel@opteya.com>

On 13/04/2015 16:29, Yann Droneaud wrote:
> On Thursday, 2 April 2015 at 18:12 +0000, Haggai Eran wrote:
...
>>
>> I want to add that we would like to see users registering a very large
>> memory region (perhaps the entire process address space) for local
>> access, and then enabling remote access only to specific regions using
>> memory windows. However, this isn't supported yet by our driver.
>
> In such a scheme, the registration must still be handled "manually":
> one has to create a memory window to get an rkey to be exchanged with a
> peer, so why would one want to register such a large memory region (the
> whole process space)?
>
> I guess creating the memory window is faster than registering a memory
> region.

Right. It takes time to create and fill the hardware's page tables. Using
memory windows allows you to reuse the work done previously, while still
having more granular control over the RDMA permissions. The larger MR can
be created with only local permissions, and the memory window can add
specific remote permissions to a smaller range. The memory window reuses
the page tables created for the memory region.

> I'd rather say this is not an excuse to register a larger memory region
> (up to the whole process space, current and future), as it sounds like a
> surprising optimisation: let the HCA know about too many pages just to
> be sure it already knows some when the process wants to use them. It
> seems it would become difficult to handle if there are too many
> processes.

Are you worried about pinning too many pages? That is an issue we want to
solve with ODP :)

>
> I would prefer creating a memory region to become costless (through ODP :).

I agree :)

>
>> Still, there are valid cases where you would want the results of an
>> mmap(0,...) call to be remotely accessible, in cases where there is
>> enough trust between the local process and the remote process.
>
> mmap(0, ..., fd) lets the kernel choose where to put the file in the
> process virtual memory space, so it may, may not, or may only partially
> end up in an ODP pre-registered memory region, for a range that is not
> yet allocated or used.
>
> I don't think one wants that to happen.

I think that in some cases the benefit of allowing this outweighs the
risks. This is why it is an opt-in feature.

>
>> It may help a middleware communication library register a large
>> portion of the address space in advance, and still work with random
>> pointers given to it by another application module.
>>
>
> But as said in the beginning of your message, the middleware would have
> to bind a memory window before posting a work request / exposing an
> rkey for the "random pointers".
>
> So I fail to understand how ODP could be used when it comes to
> registering a memory region not yet backed by anything.

In this scenario, the middleware would first register the full address
space as an ODP memory region with local permissions only. When it wants
to provide remote access to some buffer, it would bind a memory window
over the ODP MR. This is possible with multiple processes because it uses
the virtual memory system without pinning. It won't cause random mmap
regions to be mapped for RDMA without the specific intent of the
application.

However, we currently don't have support for memory windows over ODP MRs.
Even if we did, there is some performance penalty due to binding and
invalidating memory windows, so some applications will still need full
process address space access for RDMA.

Regards,
Haggai
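
[Editorial note: for illustration only, below is a minimal libibverbs
sketch of the flow described in this message: one large, local-only,
on-demand-paging registration plus a type 1 memory window that grants a
peer remote access to a sub-range. As the message says, memory windows
over ODP MRs were not supported at the time, so this is an assumption
about how the combined API would be used, not a tested recipe. The
function expose_buffer(), its parameters, and the choice of flags are
invented for the example; error cleanup is omitted.]

/*
 * Illustrative sketch only: memory windows over ODP MRs are not
 * supported here (as noted above).  The function and its parameters are
 * hypothetical, and ibv_dealloc_mw()/ibv_dereg_mr() cleanup is omitted
 * for brevity.
 */
#include <stdint.h>
#include <infiniband/verbs.h>

/* Expose [buf, buf+len) for remote access via a type 1 memory window
 * bound over one large, local-only, on-demand-paging registration.
 * Returns the rkey to hand to the peer, or 0 on failure. */
static uint32_t expose_buffer(struct ibv_pd *pd, struct ibv_qp *qp,
                              void *base, size_t span,
                              void *buf, size_t len)
{
        /* One big registration with local access only.  With
         * IBV_ACCESS_ON_DEMAND the pages are not pinned; the HCA faults
         * them in when they are first accessed. */
        struct ibv_mr *mr = ibv_reg_mr(pd, base, span,
                                       IBV_ACCESS_LOCAL_WRITE |
                                       IBV_ACCESS_ON_DEMAND |
                                       IBV_ACCESS_MW_BIND);
        if (!mr)
                return 0;

        /* The memory window adds remote permissions to a sub-range,
         * reusing the page tables already built for the MR. */
        struct ibv_mw *mw = ibv_alloc_mw(pd, IBV_MW_TYPE_1);
        if (!mw)
                return 0;

        struct ibv_mw_bind bind = {
                .wr_id = 1,
                .send_flags = IBV_SEND_SIGNALED,
                .bind_info = {
                        .mr = mr,
                        .addr = (uint64_t)(uintptr_t)buf,
                        .length = len,
                        .mw_access_flags = IBV_ACCESS_REMOTE_READ |
                                           IBV_ACCESS_REMOTE_WRITE,
                },
        };
        if (ibv_bind_mw(qp, mw, &bind))
                return 0;

        /* Only this rkey, limited to [buf, buf+len), is handed to the
         * peer; the rest of the large MR stays local-only. */
        return mw->rkey;
}

The point of the split is that the expensive ibv_reg_mr() call happens
once, while binding and invalidating the window is the per-buffer cost,
which is also where the performance penalty mentioned in the message
comes from.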