Subject: Re: [PATCH v7 2/5] of: change overlay apply input data from unflattened to FDT
To: Alan Tull
Cc: Jan Kiszka, Rob Herring, Pantelis Antoniou, Pantelis Antoniou,
 "open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS",
 "linux-kernel@vger.kernel.org", Geert Uytterhoeven, Laurent Pinchart,
 Jailhouse
References: <1520122673-11003-1-git-send-email-frowand.list@gmail.com>
 <1520122673-11003-3-git-send-email-frowand.list@gmail.com>
 <09e3db63-cbf9-52a2-ee77-520979f17fea@web.de>
 <7bbf615b-3cdd-6bb4-6918-33e48de4225d@gmail.com>
 <7bbb9472-9c96-6012-68e6-4ec2773c7732@gmail.com>
 <4483492d-37d2-63ad-6739-2cb297fa5058@gmail.com>
 <2e36bae1-b83d-2955-0f45-90b7944b552d@gmail.com>
From: Frank Rowand
Message-ID: <94d3d9e7-6b7f-07d8-82c2-407345d3158c@gmail.com>
Date: Wed, 25 Apr 2018 17:10:09 -0700
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List:
linux-kernel@vger.kernel.org

On 04/25/18 17:07, Frank Rowand wrote:
> Hi Alan,
>
> On 04/25/18 11:19, Alan Tull wrote:
>> On Wed, Apr 25, 2018 at 12:41 PM, Frank Rowand wrote:
>>> On 04/25/18 07:59, Alan Tull wrote:
>>>> On Tue, Apr 24, 2018 at 3:56 PM, Frank Rowand wrote:
>>>>> Hi Alan,
>>>>>
>>>>> On 04/23/18 15:38, Frank Rowand wrote:
>>>>>> Hi Jan,
>>>>>>
>>>>>> + Alan Tull for fpga perspective
>>>>>>
>>>>>> On 04/22/18 03:30, Jan Kiszka wrote:
>>>>>>> On 2018-04-11 07:42, Jan Kiszka wrote:
>>>>>>>> On 2018-04-05 23:12, Rob Herring wrote:
>>>>>>>>> On Thu, Apr 5, 2018 at 2:28 PM, Frank Rowand wrote:
>>>>>>>>>> On 04/05/18 12:13, Jan Kiszka wrote:
>>>>>>>>>>> On 2018-04-05 20:59, Frank Rowand wrote:
>>>>>>>>>>>> Hi Jan,
>>>>>>>>>>>>
>>>>>>>>>>>> On 04/04/18 15:35, Jan Kiszka wrote:
>>>>>>>>>>>>> Hi Frank,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2018-03-04 01:17, frowand.list@gmail.com wrote:
>>>>>>>>>>>>>> From: Frank Rowand
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Move duplicating and unflattening of an overlay flattened devicetree
>>>>>>>>>>>>>> (FDT) into the overlay application code. To accomplish this,
>>>>>>>>>>>>>> of_overlay_apply() is replaced by of_overlay_fdt_apply().
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The copy of the FDT (aka "duplicate FDT") now belongs to devicetree
>>>>>>>>>>>>>> code, which is thus responsible for freeing the duplicate FDT. The
>>>>>>>>>>>>>> caller of of_overlay_fdt_apply() remains responsible for freeing the
>>>>>>>>>>>>>> original FDT.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The unflattened devicetree now belongs to devicetree code, which is
>>>>>>>>>>>>>> thus responsible for freeing the unflattened devicetree.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These ownership changes prevent early freeing of the duplicated FDT
>>>>>>>>>>>>>> or the unflattened devicetree, which could result in use after free
>>>>>>>>>>>>>> errors.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> of_overlay_fdt_apply() is a private function for the anticipated
>>>>>>>>>>>>>> overlay loader.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We are using of_fdt_unflatten_tree + of_overlay_apply in the
>>>>>>>>>>>>> (out-of-tree) Jailhouse loader driver in order to register a virtual
>>>>>>>>>>>>> device during hypervisor activation with Linux. The DT overlay is
>>>>>>>>>>>>> created from a template but modified prior to application to account
>>>>>>>>>>>>> for runtime-specific parameters. See [1] for the current implementation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm now wondering how to model that scenario best with the new API.
>>>>>>>>>>>>> Given that the loader lost ownership of the unflattened tree but the
>>>>>>>>>>>>> modification API exists only for that DT state, I'm not yet seeing a
>>>>>>>>>>>>> clear solution. Should we apply the template in disabled form (status =
>>>>>>>>>>>>> "disabled"), modify it, and then activate it while it is already applied?
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for the pointer to the driver - that makes it much easier to
>>>>>>>>>>>> understand the use case and consider solutions.
>>>>>>>>>>>>
>>>>>>>>>>>> If you can make the changes directly on the FDT instead of on the
>>>>>>>>>>>> expanded devicetree, then you could move to the new API.
>>>>>>>>>>>
>>>>>>>>>>> Are there some examples/references on how to edit FDTs in-place in the
>>>>>>>>>>> kernel? I'd like to avoid writing the n-th FDT parser/generator.
>>>>>>>>>>
>>>>>>>>>> I don't know of any existing in-kernel edits of the FDT (but they might
>>>>>>>>>> exist). The functions to access an FDT are in libfdt, which is in
>>>>>>>>>> scripts/dtc/libfdt/.
>>>>>>>>>
>>>>>>>>> Let's please not go down that route of doing FDT modifications. There
>>>>>>>>> is little reason to other than for early boot changes. And it is much
>>>>>>>>> easier to work on unflattened trees.
>>>>>>>>
>>>>>>>> I just briefly looked into libfdt, and it would have meant building it
>>>>>>>> into the module as there are no library functions exported by the kernel
>>>>>>>> either. Another reason to drop that.
>>>>>>>>
>>>>>>>> What's apparently working now is the pattern I initially suggested:
>>>>>>>> Register template with status = "disabled" as overlay, then prepare and
>>>>>>>> apply changeset that contains all needed modifications and sets the
>>>>>>>> status to "ok". I might be leaking additional resources, but to find
>>>>>>>> that out, I will now finally have to resolve clean unbinding of the
>>>>>>>> generic PCI host controller [1] first.
>>>>>>>
>>>>>>> static void free_overlay_changeset(struct overlay_changeset *ovcs)
>>>>>>> {
>>>>>>>         [...]
>>>>>>>         /*
>>>>>>>          * TODO
>>>>>>>          *
>>>>>>>          * would like to: kfree(ovcs->overlay_tree);
>>>>>>>          * but can not since drivers may have pointers into this data
>>>>>>>          *
>>>>>>>          * would like to: kfree(ovcs->fdt);
>>>>>>>          * but can not since drivers may have pointers into this data
>>>>>>>          */
>>>>>>>
>>>>>>>         kfree(ovcs);
>>>>>>> }
>>>>>>>
>>>>>>> What's this? I have kmemleak now jumping at me over this. Who is supposed
>>>>>>> to plug these leaks? The caller of of_overlay_fdt_apply has no pointers
>>>>>>> to those objects. I would say that's a regression of the new API.
>>>>>>
>>>>>> The problem already existed but it was hidden. We have never been able to
>>>>>> kfree() these objects because we do not know if there are any pointers into
>>>>>> these objects. The new API makes the problem visible to kmemleak.
>>>>>>
>>>>>> The reason that we do not know if there are any pointers into these objects
>>>>>> is that devicetree access APIs return pointers into the devicetree internal
>>>>>> data structures (that is, into the overlay unflattened devicetree). If we
>>>>>> want to be able to do the kfree()s, we could change the devicetree access
>>>>>> APIs.
>>>>>>
>>>>>> The reason that pointers into the overlay flattened tree (ovcs->fdt) are
>>>>>> also exposed is that the overlay unflattened devicetree property values
>>>>>> are pointers into the overlay fdt.
>>>>>>
>>>>>> ** This paragraph becomes academic (and not needed) if the fix in the next
>>>>>> paragraph can be implemented. **
>>>>>> I _think_ that the fdt issue __for overlays__ can be fixed somewhat easily.
>>>>>> (I would want to read through the code again to make sure I'm not missing
>>>>>> any issues.) If the of_fdt_unflatten_tree() called by of_overlay_fdt_apply()
>>>>>> was modified so that property values were copied into newly allocated memory
>>>>>> and the live tree property pointers were set to the copy instead of to
>>>>>> the value in the fdt, then I _think_ the fdt could be freed in
>>>>>> of_overlay_fdt_apply() after calling of_overlay_apply(). The code that
>>>>>> frees a devicetree would also have to be aware of this change -- I'm not
>>>>>> sure if that leads to ugly complications or if it is easy. The other
>>>>>> question to consider is whether to make the same change to
>>>>>> of_fdt_unflatten_tree() when it is called in early boot to unflatten
>>>>>> the base devicetree. Doing so would increase the memory usage of the
>>>>>> live tree (we would not be able to free the base fdt after unflattening
>>>>>> it because we make the fdt visible in /sys/firmware/fdt -- though
>>>>>> _maybe_ that could be conditioned on CONFIG_KEXEC).
>>>>>
>>>>> Question added below this paragraph.
>>>>>
>>>>>
>>>>>> But all of the complexity of that fix is _only_ because of_overlay_apply()
>>>>>> and of_overlay_remove() call overlay_notify(), passing in the overlay
>>>>>> unflattened devicetree (which has pointers into the overlay fdt). Pointers
>>>>>> into the overlay unflattened devicetree are then passed to the notifiers.
>>>>>> (Again, I may be missing some other place that the overlay unflattened
>>>>>> devicetree is made visible to other code -- a more thorough reading of
>>>>>> the code is needed.) If the notifiers could be modified to accept the
>>>>>> changeset list instead of pointers to the fragments in the overlay
>>>>>> unflattened devicetree then there would be no possibility of the notifiers
>>>>>> keeping a pointer into the overlay fdt. I do not know if this is a
>>>>>> practical change for the notifiers -- there are no callers of
>>>>>> of_overlay_notifier_register() in the mainline kernel source. My
>>>>>> recollection is that the overlay notifiers were added for the fpga
>>>>>> subsystem.
>>>>>
>>>>> Can the fpga notifiers be changed to have the changeset as an input
>>>>> instead of having the overlay devicetree fragment and target as an
>>>>> input?
>>>>
>>>> I'll look into it. Just to be clear, are you suggesting passing
>>>> struct overlay_changeset instead in the notifier?
>>>
>>> Ah, poor phrasing on my part. I meant a "struct of_changeset", as is
>>> passed into __of_changeset_apply_entries(), which is called from
>>> of_overlay_apply(). This means that the call to overlay_notify()
>>> would have to move down a few lines to just after calling
>>> build_changeset().
>>
>> Ah yes, I thought it was looking too easy. :) I had it working with
>> notify data passing the struct overlay_changeset, I was about to send
>> you the patch.
>>
>> The FPGA code really wants the data as fragments, so it will know
>> first of all what the target is. Passing of_changeset would mean that
>> the code receiving the notification would essentially be tasked with
>> reassembling the changeset into fragments. Perhaps it could be done,
>> but it could easily be broken by changes to overlay.c and it would be
>> ugly. That breaks the exact thing that I added overlay notifications
>> for. I really need to see for each fragment what the target is, and
>> all the properties together.
>

I inadvertently chopped off part of my reply. I think I also meant to say
something like: As Jan pointed out in another email, the
> approach I proposed here is not solving the underlying problem, but just
> moving it to another place. So I am not going to pursue this approach.
>
> -Frank
>
>>
>>>
>>>
>>>> struct overlay_changeset and struct fragment would have to be moved to a header.
>>>>
>>>>>
>>>>> The changeset lists nodes and properties to be added, but does not
>>>>> expose any pointers to the overlay fdt or the overlay unflattened
>>>>> devicetree. This guarantees no leakage of pointers into the overlay
>>>>> fdt or the overlay unflattened devicetree. The changeset contains
>>>>> pointers to copies of data, but those copies are never freed (and
>>>>> thus they are yet another existing memory leak).
>>>>>
>>>>> -Frank
>>>>>
>>>>>> Why is overlay_notify() the only issue related to unknown users having
>>>>>> pointers into the overlay fdt? The answer is that the overlay code
>>>>>> does not directly expose the overlay unflattened devicetree (and thus
>>>>>> indirectly the overlay fdt) to the live devicetree -- when the
>>>>>> overlay code creates the overlay changeset, it copies from the
>>>>>> overlay unflattened devicetree and overlay fdt and only exposes
>>>>>> pointers to the copies.
>>>>>>
>>>>>> And hopefully the issues with the overlay unflattened devicetree can
>>>>>> be resolved in the same way as for the overlay fdt.
>>>>>>
>>>>>> -Frank
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
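
The property-copy idea raised in the quoted thread (copy each property
value out of the overlay FDT while unflattening, so that the FDT itself
could be freed afterwards) can be illustrated with a small userspace C
sketch. This is not kernel code and not a proposed patch: struct demo_prop
and the demo_prop_*() helpers are hypothetical names invented for this
illustration, and they only model the ownership question, not struct
property or any real devicetree API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a live-tree property (NOT the kernel's struct property). */
struct demo_prop {
	const char *name;
	void *value;      /* bytes of the property value */
	size_t length;    /* number of bytes in value */
	int owns_value;   /* 1 if value was allocated separately from the blob */
};

/*
 * Aliasing variant: the value points straight into the blob, so the blob
 * must stay allocated for as long as anyone may hold the property.
 */
void demo_prop_alias(struct demo_prop *p, const char *name,
		     char *blob, size_t off, size_t len)
{
	assert(p && blob);
	p->name = name;
	p->value = blob + off;
	p->length = len;
	p->owns_value = 0;
}

/*
 * Copying variant: the value is duplicated into its own allocation, so the
 * blob may be freed once all properties have been copied.
 * Returns 0 on success, -1 if the allocation fails.
 */
int demo_prop_copy(struct demo_prop *p, const char *name,
		   const char *blob, size_t off, size_t len)
{
	void *copy;

	assert(p && blob);
	copy = malloc(len ? len : 1);   /* avoid a zero-size allocation */
	if (!copy)
		return -1;
	memcpy(copy, blob + off, len);
	p->name = name;
	p->value = copy;
	p->length = len;
	p->owns_value = 1;
	return 0;
}

/* Release a property; only a copied value is freed here, never the blob. */
void demo_prop_release(struct demo_prop *p)
{
	if (p->owns_value)
		free(p->value);
	p->value = NULL;
	p->length = 0;
	p->owns_value = 0;
}
```

With the aliasing helper the blob has to outlive every property, which
mirrors the situation described above for ovcs->fdt. With the copying
helper the blob can be released as soon as the copies exist, at the cost
of extra memory and of a release path that knows which values it owns.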