From: Serguei Sagalovitch
To: Jason Gunthorpe, Logan Gunthorpe
CC: Dan Williams, "Deucher, Alexander", "linux-nvdimm@lists.01.org",
    "linux-rdma@vger.kernel.org", "linux-pci@vger.kernel.org",
    "Kuehling, Felix", "Bridgman, John", "linux-kernel@vger.kernel.org",
    "dri-devel@lists.freedesktop.org", "Koenig, Christian", "Sander, Ben",
    "Suthikulpanit, Suravee", "Blinzer, Paul", "Linux-media@vger.kernel.org",
    Haggai Eran
Subject: Re: Enabling peer to peer device transactions for PCIe devices
Date: Wed, 23 Nov 2016 14:14:40 -0500
Message-ID: <7bc38037-b6ab-943f-59db-6280e16901ab@amd.com>
In-Reply-To: <20161123190515.GA12146@obsidianresearch.com>
References: <75a1f44f-c495-7d1e-7e1c-17e89555edba@amd.com>
    <45c6e878-bece-7987-aee7-0e940044158c@deltatee.com>
    <20161123190515.GA12146@obsidianresearch.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2016-11-23 02:05 PM, Jason Gunthorpe wrote:
> On Wed, Nov 23, 2016 at 10:13:03AM -0700, Logan Gunthorpe wrote:
>
>> an MR would be very tricky.
>> The MR may be relied upon by another host
>> and the kernel would have to inform user-space the MR was invalid then
>> user-space would have to tell the remote application.
> As Bart says, it would be best to be combined with something like
> Mellanox's ODP MRs, which allows a page to be evicted and then trigger
> a CPU interrupt if a DMA is attempted so it can be brought back.

Please note that in the general case (including the MR one) a "page fault"
could come from a different PCIe device, so all PCIe devices must be
synchronized.

> includes the usual fencing mechanism so the CPU can block, flush, and
> then evict a page coherently.
>
> This is the general direction the industry is going in: Link PCI DMA
> directly to dynamic user page tables, including support for demand
> faulting and synchronicity.
>
> Mellanox ODP is a rough implementation of mirroring a process's page
> table via the kernel, while IBM's CAPI (and CCIX, PCI ATS?) is
> probably a good example of where this is ultimately headed.
>
> CAPI allows a PCI DMA to directly target an ASID associated with a
> user process and then use the usual CPU machinery to do the page
> translation for the DMA. This includes page faults for evicted pages,
> and obviously allows eviction and migration..
>
> So, of all the solutions in the original list, I would discard
> anything that isn't VMA focused. Emulating what CAPI does in hardware
> with software is probably the best choice, or we have to do it all
> again when CAPI style hardware broadly rolls out :(
>
> DAX and GPU allocators should create VMAs and manipulate them in the
> usual way to achieve migration, windowing, cache&mirror, movement or
> swap of the potentially peer-peer memory pages. They would have to
> respect the usual rules for a VMA, including pinning.
>
> DMA drivers would use the usual approaches for dealing with DMA from
> a VMA: short term pin or long term coherent translation mirror.
>
> So, to my view (looking from RDMA), the main problem with peer-peer is
> how do you DMA translate VMA's that point at non struct page memory?
>
> Does HMM solve the peer-peer problem? Does it do it generically or
> only for drivers that are mirroring translation tables?

In its current form HMM doesn't solve the peer-peer problem. Currently it
allows "mirroring" of "malloc" memory on the GPU, which is not always what
is needed. Additionally, there is a need to be able to share VRAM
allocations between different processes.

> From a RDMA perspective we could use something other than
> get_user_pages() to pin and DMA translate a VMA if the core community
> could decide on an API. eg get_user_dma_sg() would probably be quite
> usable.
>
> Jason
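
To illustrate the synchronization point made above, here is a rough
user-space model in C, not kernel code. It assumes a toy address space of a
few pages and a per-device "mirror" of which translations are valid. Before
a page is evicted, every registered mirror is invalidated, and a later DMA
access from any device faults the page back in. All names (toy_mm,
dma_mirror, toy_mm_register, toy_mm_evict, dev_dma_access) are hypothetical
and do not correspond to any existing kernel or driver API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NPAGES      8   /* size of the toy address space, in pages */
#define MAX_MIRRORS 4   /* max devices mirroring one address space */

/* Hypothetical per-device view of the process page table. */
struct dma_mirror {
    bool present[NPAGES];   /* translation currently usable for DMA */
    unsigned long faults;   /* demand faults serviced for this device */
};

/* Hypothetical CPU-side state: residency plus all registered mirrors. */
struct toy_mm {
    bool resident[NPAGES];
    struct dma_mirror *mirrors[MAX_MIRRORS];
    size_t nr_mirrors;
};

/* Start with every page resident and no device attached. */
void toy_mm_init(struct toy_mm *mm)
{
    memset(mm, 0, sizeof(*mm));
    for (size_t i = 0; i < NPAGES; i++)
        mm->resident[i] = true;
}

/* Attach a device mirror; it starts empty, so first access faults. */
int toy_mm_register(struct toy_mm *mm, struct dma_mirror *mirror)
{
    if (mm->nr_mirrors == MAX_MIRRORS)
        return -1;
    memset(mirror, 0, sizeof(*mirror));
    mm->mirrors[mm->nr_mirrors++] = mirror;
    return 0;
}

/* Evict a page: invalidate it in ALL mirrors first, then drop it. */
int toy_mm_evict(struct toy_mm *mm, size_t page)
{
    if (page >= NPAGES)
        return -1;
    for (size_t i = 0; i < mm->nr_mirrors; i++)
        mm->mirrors[i]->present[page] = false;
    mm->resident[page] = false;
    return 0;
}

/* Device DMA access: fault the page back in if the mirror lacks it. */
int dev_dma_access(struct toy_mm *mm, struct dma_mirror *mirror, size_t page)
{
    if (page >= NPAGES)
        return -1;
    if (!mirror->present[page]) {
        mirror->faults++;
        mm->resident[page] = true;      /* bring the page back */
        mirror->present[page] = true;   /* re-establish translation */
    }
    return 0;
}
```

In this model the fault can originate from whichever device touches the
page next, which is why the eviction path walks every mirror rather than
only the one that last used the page. Real hardware would also need
fencing of in-flight DMA before the eviction completes; that is left out
here.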