Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1761467AbcKDKx2 (ORCPT ); Fri, 4 Nov 2016 06:53:28 -0400 Received: from mailout2.w1.samsung.com ([210.118.77.12]:11254 "EHLO mailout2.w1.samsung.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1760389AbcKDKxY (ORCPT ); Fri, 4 Nov 2016 06:53:24 -0400 X-AuditID: cbfec7f5-f79ce6d000004c54-bb-581c689f063d Subject: Re: [RFC 00/17] clk: Add per-controller locks to fix deadlocks To: Stephen Boyd , Krzysztof Kozlowski Cc: Michael Turquette , Stephen Warren , Lee Jones , Eric Anholt , Florian Fainelli , Ray Jui , Scott Branden , bcm-kernel-feedback-list@broadcom.com, Sylwester Nawrocki , Tomasz Figa , Kukjin Kim , Russell King , Mark Brown , linux-clk@vger.kernel.org, linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-samsung-soc@vger.kernel.org, linux-i2c@vger.kernel.org, alsa-devel@alsa-project.org, Charles Keepax , Javier Martinez Canillas , a.hajda@samsung.com, Bartlomiej Zolnierkiewicz From: Marek Szyprowski Message-id: <7933d51e-92a8-ca6d-84f0-70b22fe17568@samsung.com> Date: Fri, 04 Nov 2016 11:53:16 +0100 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0 MIME-version: 1.0 In-reply-to: <20160909002432.GE13062@codeaurora.org> Content-type: text/plain; charset=utf-8; format=flowed Content-transfer-encoding: 7bit X-Brightmail-Tracker: H4sIAAAAAAAAA02Sa0hTYRjHec85OzuOpqe56sW8wCoILU2seDOxgqJDH8w+VBaBjjyo6NR2 VLIv2XBq834p5zAsyLxkqfOaZK2VipdcNjEtM01NMmeEBt22mjsL/PZ73uf/Pv/n//JSuKSC 9KDiElNYZaI8QUaKiPben6bdVbGeEXsKO3D09uGwAI2OGDHUrG0UoAf5PQS6MTNHIlvZuBAZ rpkB+rX8QoiWLA0YKpz9giOTqUmIOrPuEEg/OyZA3/KmBCjHugqQuauSRMXv7F2t6QmGjDe6 ARoZOIbmK1pI9HwpW4B+DOUSaPn+HECLz7IIVN/1Bxz2YJpWVCSj+tpLMuaxEZz5Oq4WMroP L+1lQT7GPNK9FzL6+usk03L3KvO5+DfJtBbajwpa6wGzovdmmm+dDnc9LwqJZhPi0lhlQGiU KLbIUkkmt4VczszqF2SAbwEa4EJBei+0jVYIeN4MX001khogoiR0NYBW22uML1YAnNXm4f9v lK9qcb5xD8Bi4wTgiwUAVZW12JrKnT4O3yx0kmsspcNhR3+vwwOn84TQmumzxiQdCDUWjUMj pkNhW12DQ0PQO+DCn2GH2yb6AnxXNobzmo3wR+kUoQEU5WLfQj3jzY8Mhp9saud4H9jSYHEs B+klCnbXWsCaHtJeUG9wBjgKvxcbnJHd4WJfq5BnT2guzSV4LrRnUfvxrAVw2CLm+SB83jfi 9HKFJe3lOD9eDHOyJLyEgZaCSYznI3BiUeV80R4Ax29+JIuAj25dGt26CLp1EW4DvB5I2VRO EcNy+/w5uYJLTYzxv5ik0AP7hx209X3vBNW9wUZAU0C2QawN2xohEcjTuHSFEUAKl0nF2TGe ERJxtDz9CqtMilSmJrCcEWylCNkW8ePbo2cldIw8hY1n2WRW+b+LUS4eGcCAaroWOdsp8tyw VDLtHRyknHzWqj8U+1Aq6psY9CsxiyOsJTV1yftdL7kZvAhl9rTKtYa+5dVSXuXmZh36aFI3 HnTfFTevTZtSxKdKB5Yb3f4mSzP8ok4ObKs6pLsa/sC3gxW6PD3jk3kg3bcprC0ICzoRaSZC qNzpmZ3bw2QEFysP9MWVnPwf1Z2QTqwDAAA= X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFtrJKsWRmVeSWpSXmKPExsVy+t/xK7r1GTIRBvP+GVrcWneO1eLKxUNM FhtnrGe1WNt7lMVi6sMnbBb/ptxgtzjQeJnR4te7I+wWb96uYbLof/ya2eL8+Q3sFjvaFrJY bHp8jdXiY889VouOv18YLS7vmsNmMfE2UHbG+X1MFoem7mW0uHjK1eLpzM1sFofftLNa/DjT zWLxbvUTRotXB9tYLFbt+sPoIOWx4XMTm0fT+2NsHpevXWT2eH+jld1j1v2zQG5fL5PHzll3 2T02repk89i8pN7j5cTfbB5b+oFCfVtWMXp83iTnsXFuaABflJtNRmpiSmqRQmpecn5KZl66 rVJoiJuuhZJCXmJuqq1ShK5vSJCSQlliTimQZ2SABhycA9yDlfTtEtwyJrydw1aw1aaipe0k awPjR/0uRk4OCQETielfZjBD2GISF+6tZ+ti5OIQEljCKHF46Qoo5zmjxKfF05hAqoQF3CWu P98BlODgEBHwkzjyrxQkLCRQKPFyy0ZmkHpmgT52if9b/rCBJNgEDCW63naB2bwCdhJbV65h BbFZBFQlnv85B7ZZVCBG4vqzR1A1ghI/Jt9jAZnPCXRd60M5kDCzgJnEl5eHWSFseYnNa94y T2AUmIWkYxaSsllIyhYwMq9iFEktLc5Nzy020itOzC0uzUvXS87P3cQITDbbjv3csoOx613w IUYBDkYlHt4ZftIRQqyJZcWVuYcYJTiYlUR429NlIoR4UxIrq1KL8uOLSnNSiw8xmgL9MJFZ SjQ5H5gI80riDU0MzS0NjYwtLMyNjJTEead+uBIuJJCeWJKanZpakFoE08fEwSnVwJi6typJ 7eb5DcFOK3qevXQ5FM67RevA4febFm6Qu+S7+IZ0S/XpJZqMaomcly/e00mtM3Bq9lpxbMpa Q+1r+/OEJ/Gye5+o0Diue66lc2/U3Of6PBsEne1czm9fZZemLPXZ7aFRR1Rd1v3wW0XHu82D fXV5X0fozjM9quvxIOwawzc1zl8bbimxFGckGmoxFxUnAgCIwP1dTAMAAA== X-MTR: 20000000000000000@CPGS X-CMS-MailID: 20161104105318eucas1p2e9458dbe7039f2a81419fae8cf6bc4c8 X-Msg-Generator: CA X-Sender-IP: 182.198.249.180 X-Local-Sender: =?UTF-8?B?TWFyZWsgU3p5cHJvd3NraRtTUlBPTC1LZXJuZWwgKFRQKRs=?= =?UTF-8?B?7IK87ISx7KCE7J6QG1NlbmlvciBTb2Z0d2FyZSBFbmdpbmVlcg==?= X-Global-Sender: =?UTF-8?B?TWFyZWsgU3p5cHJvd3NraRtTUlBPTC1LZXJuZWwgKFRQKRtT?= =?UTF-8?B?YW1zdW5nIEVsZWN0cm9uaWNzG1NlbmlvciBTb2Z0d2FyZSBFbmdpbmVlcg==?= X-Sender-Code: =?UTF-8?B?QzEwG0VIURtDMTBDRDAyQ0QwMjczOTI=?= CMS-TYPE: 201P X-HopCount: 7 X-CMS-RootMailID: 20161104105318eucas1p2e9458dbe7039f2a81419fae8cf6bc4c8 X-RootMTR: 20161104105318eucas1p2e9458dbe7039f2a81419fae8cf6bc4c8 References: <1471354514-24224-1-git-send-email-k.kozlowski@samsung.com> <20160909002432.GE13062@codeaurora.org> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7565 Lines: 166 Hi Stephen, Krzysztof has left Samsung, but we would like to continue this task, because the ABBA dead-locks related to global prepare lock becomes more and more problematic for us to workaround. On 2016-09-09 02:24, Stephen Boyd wrote: > On 08/16, Krzysztof Kozlowski wrote: >> RFC, please, do not apply, maybe except patch #1 which is harmless. >> >> >> Introduction >> ============ >> The patchset brings new entity: clock controller representing a hardware >> block. The clock controller comes with its own prepare lock which >> is used then in many places. The idea is to fix the deadlock mentioned >> in commit 10ff4c5239a1 ("i2c: exynos5: Fix possible ABBA deadlock by keeping >> I2C clock prepared") and commit 34e81ad5f0b6 ("i2c: s3c2410: fix ABBA deadlock >> by keeping clock prepared"). >> >> >> Disclaimer >> ========== >> Request for comments, so: >> 1. Only exynos_defconfig builds, >> 2. A lot of FIXME/TODO note still, >> 3. Checkpatch not run, lines not aligned, >> 4. Other (non-exynos) drivers not converted, >> 5. Probably not yet bisectable, >> 6. Locking became quite complex. >> The previous one lock was simple. Inefficient and dead-lock prone but >> simple. Because of clock hierarchy spanning through controllers, the >> new locking became quite complicated. I don't like it but... >> >> >> Details >> ======= >> In Exynos-based boards case the deadlock occurs between clock's >> prepare_lock and regmap-i2c's lock: >> >> CPU #0: CPU #1: >> lock(regmap) >> s2mps11-clk: clk_prepare_lock() >> >> i2c-exynos: clk_prepare_lock() - wait >> lock(regmap) - wait >> >> The clk_prepare_lock() on both CPUs come from different clock drivers >> and components: >> 1. I2C clock is part of SoC block and is required by i2c-s3c2410/i2c-exynos5 >> driver, >> 2. S2MPS11 clock is separate device, however driver uses I2C regmap. >> >> The deadlock was reported by lockdep (always) and was happening >> in 20% of boots of Odroid XU3 with multi_v7 defconfig. Workaround for >> deadlock was implemented by removing prepare/unprepare calls from I2C >> transfers. However these are just workarounds... which after applying >> this patch can be reverted. >> >> Additionally Marek Szyprowski's work on domains/clocks/pinctrl exposed >> the deadlock again in different configuration. >> > Per-controller locks seems to be a misnomer. These patches look > more like per-clk_register() locks. The clk registrant can decide > how fine grained to make the prepare locking scheme. They can > decide to have one lock per clk, one lock for the entire tree, or > one lock per arbitrary subtree. I'd prefer to drop the controller > concept unless it's really done that way, by attaching to the OF > clk provider or device pointer. This was not phrased well but the idea was to let provider/controller have ability to define which clocks will use common lock. Probably removing the concept of clock controller and moving the lock to clk_provider will be much better approach. > Given that, it seems that conceptually these patches allow clk > registrants to carve out a set of clks that they want to be > treated as an atomic unit with regards to the prepare lock. > TL;DR trees of per-process recursive prepare mutex locks where > the locks cover one or more clks. > > The locks are recursive per-process to simplify the problem where > we need to acquire the same lock multiple times while traversing > a tree that shares the same lock, right? Otherwise we would need > to track which locks we've already grabbed and refcount and we > wouldn't be able to re-enter the framework from within the clk > ops of the same atomic unit. > > As long as we can guarantee that we always take the locks in the > same order we'll be fine. The burden to ensure that would be > placed on the clk registrants though, and verifying that will be > left up to humans? That seems fragile and error prone. We can > always say that we take the global lock and then traverse the > children up to the roots, but we can't be sure that during the > tree traversal we don't take locks in different order because of > how the clk registrant has structured things. > > It would be great if the different clk APIs were listed, with the > order of how locks are taken BTW. And really overall just more > information about how the locking scheme works. I had to read > this patch series quite a few times to come to (hopefully the > right) conclusions. Right, such change requires much more documentation and especially that it took significant amount of time to figure out all(?) possible use cases and locking schemes. > So I'm not very fond of this design because the locking scheme is > pretty much out of the hands of the framework and can be easily > broken. Well, switching from a single global lock to more granular locking is always a good approach. Please note that the global lock sooner or later became a serious bottleneck if one wants to make somehow more aggressive power management and clock gating. > I'm biased of course, because I'd prefer we go with my > wwmutex design of per-clk locks[1]. Taking locks in any order > works fine there, and we resolve quite a few long standing > locking problems that we have while improving scalability. The > problem there is that we don't get the recursive mutex design > (maybe that's a benefit!). Do you have any plan to continue working on your approach? per-clk wwmutex looks like an overkill on the first glance, but that's probably the only working solution if you want to get rid of recursive locks. I'm still not really convinced that we really need wwmutex here, especially if it is possible to guarantee the same order of locking operations inside the clock core. This requires a bit of cooperation from clock providers (with proper documentation and a list of DO/DON'T it shouldn't be that hard). > Once a clk_op reenters the framework > with consumer APIs and tries to grab the same lock we deadlock. > This is why I've been slowly splitting consumers from providers > so we can easily identify these cases. If we had something like > coordinated clk rate switching, we could get rid of clk_ops > reentering the framework and avoid this problem (and we really do > need to do that). I'm not sure that this makes really sense split consumers and providers. You will get recursive calls to clk core anyway with consumers calls if you are implementing i2c clock, for which an i2c bus driver does it's own clock gating (i2c controller uses consumer clk api). > The one thing I do like about this patch series though is the > gradual movement of providers to the new locking scheme. Maybe if > we combined that idea with per-clk wwmutex locks we could have > the best of both worlds and fix reentrancy later when we need it? > There's wwmutex_trylock() that we could use. Perhaps always pass > a NULL context if some clk flag isn't set (CLK_USE_PER_CLK_LOCK). > Plus we could assign the same wwmutex all over the place when > that flag isn't set and allocate a unique wwmutex at clk_register > time otherwise. I haven't thought about it deeply, so there may > be some glaring problem that I'm missing like when we have half > the clks in a tree with the flag set or something. This is where > driving home from the office for 45 minutes helps. > > [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-April/251039.html Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland