From: Alan Tull
Date: Tue, 24 Apr 2018 11:08:19 -0500
Subject: Re: [PATCH v7 2/5] of: change overlay apply input data from unflattened to FDT
To: Jan Kiszka
Cc: Frank Rowand, Rob Herring, Pantelis Antoniou,
    "open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS",
    "linux-kernel@vger.kernel.org", Geert Uytterhoeven, Laurent Pinchart,
    Jailhouse

On Tue, Apr 24, 2018 at 12:29 AM, Jan Kiszka wrote:
> On 2018-04-24 00:38, Frank Rowand wrote:
>> Hi Jan,
>>
>> + Alan Tull for fpga perspective
>>
>> On 04/22/18 03:30, Jan Kiszka wrote:
>>> On 2018-04-11 07:42, Jan Kiszka wrote:
>>>> On 2018-04-05 23:12, Rob Herring wrote:
>>>>> On Thu, Apr 5, 2018 at 2:28 PM, Frank Rowand wrote:
>>>>>> On 04/05/18 12:13, Jan Kiszka wrote:
>>>>>>> On 2018-04-05 20:59, Frank Rowand wrote:
>>>>>>>> Hi Jan,
>>>>>>>>
>>>>>>>> On 04/04/18 15:35, Jan Kiszka wrote:
>>>>>>>>> Hi Frank,
>>>>>>>>>
>>>>>>>>> On 2018-03-04 01:17, frowand.list@gmail.com wrote:
>>>>>>>>>> From: Frank Rowand
>>>>>>>>>>
>>>>>>>>>> Move duplicating and unflattening of an overlay flattened devicetree
>>>>>>>>>> (FDT) into the overlay application code. To accomplish this,
>>>>>>>>>> of_overlay_apply() is replaced by of_overlay_fdt_apply().
>>>>>>>>>>
>>>>>>>>>> The copy of the FDT (aka "duplicate FDT") now belongs to devicetree
>>>>>>>>>> code, which is thus responsible for freeing the duplicate FDT. The
>>>>>>>>>> caller of of_overlay_fdt_apply() remains responsible for freeing the
>>>>>>>>>> original FDT.
>>>>>>>>>>
>>>>>>>>>> The unflattened devicetree now belongs to devicetree code, which is
>>>>>>>>>> thus responsible for freeing the unflattened devicetree.
>>>>>>>>>>
>>>>>>>>>> These ownership changes prevent early freeing of the duplicated FDT
>>>>>>>>>> or the unflattened devicetree, which could result in use after free
>>>>>>>>>> errors.
>>>>>>>>>>
>>>>>>>>>> of_overlay_fdt_apply() is a private function for the anticipated
>>>>>>>>>> overlay loader.
>>>>>>>>>
>>>>>>>>> We are using of_fdt_unflatten_tree + of_overlay_apply in the
>>>>>>>>> (out-of-tree) Jailhouse loader driver in order to register a virtual
>>>>>>>>> device during hypervisor activation with Linux. The DT overlay is
>>>>>>>>> created from a template but modified prior to application to account
>>>>>>>>> for runtime-specific parameters. See [1] for the current implementation.
>>>>>>>>>
>>>>>>>>> I'm now wondering how to model that scenario best with the new API.
>>>>>>>>> Given that the loader lost ownership of the unflattened tree but the
>>>>>>>>> modification API exists only for that DT state, I'm not yet seeing a
>>>>>>>>> clear solution. Should we apply the template in disabled form (status =
>>>>>>>>> "disabled"), modify it, and then activate it while it is already applied?
>>>>>>>>
>>>>>>>> Thank you for the pointer to the driver - that makes it much easier to
>>>>>>>> understand the use case and consider solutions.
>>>>>>>>
>>>>>>>> If you can make the changes directly on the FDT instead of on the
>>>>>>>> expanded devicetree, then you could move to the new API.
>>>>>>>
>>>>>>> Are there some examples/references on how to edit FDTs in-place in the
>>>>>>> kernel? I'd like to avoid writing the n-th FDT parser/generator.
>>>>>>
>>>>>> I don't know of any existing in-kernel edits of the FDT (but they might
>>>>>> exist). The functions to access an FDT are in libfdt, which is in
>>>>>> scripts/dtc/libfdt/.
>>>>>
>>>>> Let's please not go down that route of doing FDT modifications.
>>>>> There is little reason to do so other than for early boot changes. And it
>>>>> is much easier to work on unflattened trees.
>>>>
>>>> I just briefly looked into libfdt, and it would have meant building it
>>>> into the module as there are no library functions exported by the kernel
>>>> either. Another reason to drop that.
>>>>
>>>> What's apparently working now is the pattern I initially suggested:
>>>> Register template with status = "disabled" as overlay, then prepare and
>>>> apply changeset that contains all needed modifications and sets the
>>>> status to "ok". I might be leaking additional resources, but to find
>>>> that out, I will now finally have to resolve clean unbinding of the
>>>> generic PCI host controller [1] first.
>>>
>>> static void free_overlay_changeset(struct overlay_changeset *ovcs)
>>> {
>>> [...]
>>>         /*
>>>          * TODO
>>>          *
>>>          * would like to:  kfree(ovcs->overlay_tree);
>>>          * but can not since drivers may have pointers into this data
>>>          *
>>>          * would like to:  kfree(ovcs->fdt);
>>>          * but can not since drivers may have pointers into this data
>>>          */
>>>
>>>         kfree(ovcs);
>>> }
>>>
>>> What's this? I have kmemleak now jumping at me over this. Who is supposed
>>> to plug these leaks? The caller of of_overlay_fdt_apply has no pointers
>>> to those objects. I would say that's a regression of the new API.
>>
>> The problem already existed but it was hidden. We have never been able to
>> kfree() these objects because we do not know if there are any pointers into
>> these objects. The new API makes the problem visible to kmemleak.
>
> My old code didn't have the problem because there was no one stealing
> pointers to my overlay, and I was able to safely release all the
> resources that I or the core on my behalf allocated. In fact, I recently
> even dropped the duplication of the fdt prior to unflattening it because I
> got its lifecycle under control (and both kmemleak as well as kasan
> confirmed this).
> I still consider this intentional leak a regression of
> the new API.
>
>>
>> The reason that we do not know if there are any pointers into these objects
>> is that devicetree access APIs return pointers into the devicetree internal
>> data structures (that is, into the overlay unflattened devicetree). If we
>> want to be able to do the kfree()s, we could change the devicetree access
>> APIs.
>>
>> The reason that pointers into the overlay flattened tree (ovcs->fdt) are
>> also exposed is that the overlay unflattened devicetree property values
>> are pointers into the overlay fdt.
>>
>> ** This paragraph becomes academic (and not needed) if the fix in the next
>> paragraph can be implemented. **
>> I _think_ that the fdt issue __for overlays__ can be fixed somewhat easily.
>> (I would want to read through the code again to make sure I'm not missing
>> any issues.) If the of_fdt_unflatten_tree() called by of_overlay_fdt_apply()
>> was modified so that property values were copied into newly allocated memory
>> and the live tree property pointers were set to the copy instead of to
>> the value in the fdt, then I _think_ the fdt could be freed in
>> of_overlay_fdt_apply() after calling of_overlay_apply(). The code that
>
> I don't see yet how more duplicating of objects would help. Then we
> would not leak the fdt or the unflattened tree on overlay destruction
> but those duplicates, no?
>
>> frees a devicetree would also have to be aware of this change -- I'm not
>> sure if that leads to ugly complications or if it is easy. The other
>> question to consider is whether to make the same change to
>> of_fdt_unflatten_tree() when it is called in early boot to unflatten
>> the base devicetree. Doing so would increase the memory usage of the
>> live tree (we would not be able to free the base fdt after unflattening
>> it because we make the fdt visible in /sys/firmware/fdt -- though
>> _maybe_ that could be conditioned on CONFIG_KEXEC).
>>
>> But all of the complexity of that fix is _only_ because of_overlay_apply()
>> and of_overlay_remove() call overlay_notify(), passing in the overlay
>> unflattened devicetree (which has pointers into the overlay fdt). Pointers
>> into the overlay unflattened devicetree are then passed to the notifiers.
>> (Again, I may be missing some other place that the overlay unflattened
>> devicetree is made visible to other code -- a more thorough reading of
>> the code is needed.) If the notifiers could be modified to accept the
>> changeset list instead of pointers to the fragments in the overlay
>> unflattened devicetree then there would be no possibility of the notifiers
>> keeping a pointer into the overlay fdt. I do not know if this is a
>
> But then again the convention has to be that those changeset pointers
> must not be kept - because the changeset is history after of_overlay_remove.
>
>> practical change for the notifiers -- there are no callers of
>> of_overlay_notifier_register() in the mainline kernel source. My
>> recollection is that the overlay notifiers were added for the fpga
>> subsystem.

That's right.

>
> We have drivers/fpga/of-fpga-region.c in-tree, and that does not seem to
> store any pointers to objects, rather consumes them in-place. And I
> would consider it fair to impose such a limitation on the notifier
> interface.

The FPGA code was written assuming that overlays could be removed.

>
>>
>> Why is overlay_notify() the only issue related to unknown users having
>> pointers into the overlay fdt? The answer is that the overlay code
>> does not directly expose the overlay unflattened devicetree (and thus
>> indirectly the overlay fdt) to the live devicetree -- when the
>> overlay code creates the overlay changeset, it copies from the
>> overlay unflattened devicetree and overlay fdt and only exposes
>> pointers to the copies.
>>
>> And hopefully the issues with the overlay unflattened devicetree can
>> be resolved in the same way as for the overlay fdt.
>
> As noted above, I don't see there is a technical solution to this issue
> but it's rather a matter of convention: no overlay notifier callback is
> allowed to keep references to the passed tree content (unless it
> reference-counts some tree nodes) beyond the execution of the callback.
> With that in place, we can safely drop the backing memory IMHO.
>
> Jan
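
For illustration only, the "keep no references beyond the callback"
convention discussed above could be sketched in plain userspace C as below.
This is not kernel code and not the overlay notifier API: demo_prop,
demo_state, demo_notify() and demo_apply_and_free() are hypothetical names
invented for the sketch. It assumes the callback only needs a small string
value, copies it into its own storage, and lets the owner free the backing
memory as soon as the callback returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an overlay property whose value lives in
 * memory owned by the overlay core (not a real kernel type). */
struct demo_prop {
	const char *name;
	const char *value;	/* points into core-owned backing memory */
};

/* Hypothetical per-driver state filled in by the notifier callback. */
struct demo_state {
	char status[16];	/* private copy, valid after the backing memory is freed */
};

/* Callback following the convention: consume the data in place, copy what
 * must outlive the call, and keep no pointer into prop->value. */
static int demo_notify(struct demo_state *st, const struct demo_prop *prop)
{
	assert(st && prop && prop->name && prop->value);

	if (strcmp(prop->name, "status") != 0)
		return 0;	/* not a property this callback cares about */
	if (strlen(prop->value) >= sizeof(st->status))
		return -1;	/* would not fit into the private copy */

	strcpy(st->status, prop->value);
	return 1;
}

/* Owner side: build the backing memory, run the callback, then free the
 * memory immediately. This is only safe because of the convention above. */
static int demo_apply_and_free(struct demo_state *st, const char *value)
{
	size_t len = strlen(value) + 1;
	char *backing = malloc(len);	/* plays the role of the overlay fdt */
	struct demo_prop prop;
	int ret;

	if (!backing)
		return -1;
	memcpy(backing, value, len);
	prop.name = "status";
	prop.value = backing;

	ret = demo_notify(st, &prop);

	/* Poison before freeing so a retained pointer would be visibly stale. */
	memset(backing, 0x6b, len);
	free(backing);
	return ret;
}
```

Calling demo_apply_and_free(&st, "ok") leaves "ok" in st.status even though
the backing buffer has already been poisoned and freed, which is the
property the convention is meant to guarantee.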