On 2017-05-31 06:03, Stephen Boyd wrote:
> On 05/30, Kiran Gunda wrote:
>> From: Abhijeet Dharmapurikar <[email protected]>
>>
>> The system crashes due to bad access when reading from a non-configured
>> peripheral and when writing to a peripheral which is not owned by the
>> current ee. This patch verifies ownership to avoid crashing on write.
>
> What systems? As far as I know we don't have any bad accesses
> happening right now. If they are happening, we should fix the
> code that's accessing hardware that isn't owned by them.
>
This change greatly reduces the debugging effort for developers by printing
a simple, clear error message when an invalid SPMI access occurs (due to bad
DT configuration, bad bootloader SPMI permission configuration, or other
issues). Without this change, such accesses cause XPU violations that crash
the system and require extensive effort to decode.
>> For reads, since the forward mapping table, data_channel->ppid, is
>> towards the end of the block, we use the core size to figure the
>> max number of ppids supported. The table starts at an offset of 0x800
>> within the block, so size - 0x800 will give us the area used by the
>> table. Since each table entry is 4 bytes long, (core_size - 0x800) / 4
>> gives us the number of data_channels supported.
>> This new protection is functional on hw v2.
>
> Which brings us to the next question which is why do we need this
> patch at all? We aren't probing hardware to see what we have
> access to and then populating device structures based on that.
> Instead, we're just populating DT nodes that we've hardcoded in
> the dts files, so I'm a little lost on why we would have a node
> in there that we couldn't access. Please add such details to the
> commit text.
>
Invalid SPMI accesses occur due to bad DT configuration, bad bootloader SPMI
permission configuration, or other issues. This change reduces the debugging
effort for developers by printing a clear error message when an invalid SPMI
access occurs.
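
For reference, the table-size arithmetic from the commit text can be sketched in isolation. This is a standalone illustration, not driver code: `CHNL_TABLE_OFFSET`, `CHNL_ENTRY_SIZE` and `max_periph_from_core_size()` are hypothetical names, and the 0x800 offset and 4-byte entry size are simply the values quoted in the commit text.

```c
#include <stddef.h>
#include <stdint.h>

/* Values quoted in the commit text; the macro names are hypothetical. */
#define CHNL_TABLE_OFFSET 0x800u /* table start within the core block */
#define CHNL_ENTRY_SIZE   4u     /* one 32-bit entry per data channel */

/*
 * Number of channel entries that fit between the start of the table
 * and the end of the core region. Returns 0 when the region is too
 * small to contain any entry, and clamps to the u16 range used by
 * the patch's max_periph field.
 */
static uint16_t max_periph_from_core_size(size_t core_size)
{
	size_t entries;

	if (core_size <= CHNL_TABLE_OFFSET)
		return 0;

	entries = (core_size - CHNL_TABLE_OFFSET) / CHNL_ENTRY_SIZE;
	return entries > UINT16_MAX ? UINT16_MAX : (uint16_t)entries;
}
```

With a 0x1000-byte core region this yields (0x1000 - 0x800) / 4 = 512 channels.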
>>
>> Signed-off-by: Abhijeet Dharmapurikar <[email protected]>
>> Signed-off-by: Kiran Gunda <[email protected]>
>> ---
>> drivers/spmi/spmi-pmic-arb.c | 84 +++++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 83 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/spmi/spmi-pmic-arb.c b/drivers/spmi/spmi-pmic-arb.c
>> index 5ec3a59..df463d4 100644
>> --- a/drivers/spmi/spmi-pmic-arb.c
>> +++ b/drivers/spmi/spmi-pmic-arb.c
>> @@ -111,6 +111,7 @@ enum pmic_arb_cmd_op_code {
>> * @ee: the current Execution Environment
>> * @min_apid: minimum APID (used for bounding IRQ search)
>> * @max_apid: maximum APID
>> + * @max_periph: maximum number of PMIC peripherals supported by HW.
>
> Nitpick: Most of these lines don't end with a full-stop.
>
Will address in the next clean up patch.
>> * @mapping_table: in-memory copy of PPID -> APID mapping table.
>> * @domain: irq domain object for PMIC IRQ domain
>> * @spmic: SPMI controller object
>> @@ -132,6 +133,7 @@ struct spmi_pmic_arb_dev {
>> u8 ee;
>> u16 min_apid;
>> u16 max_apid;
>> + u16 max_periph;
>> u32 *mapping_table;
>> DECLARE_BITMAP(mapping_table_valid, PMIC_ARB_MAX_PERIPHS);
>> struct irq_domain *domain;
>> @@ -140,11 +142,13 @@ struct spmi_pmic_arb_dev {
>> const struct pmic_arb_ver_ops *ver_ops;
>> u16 *ppid_to_chan;
>> u16 last_channel;
>> + u8 *chan_to_owner;
>
> And we didn't document this one.
>
Will document in the next clean up patch.
>> };
>>
>> /**
>> * pmic_arb_ver: version dependent functionality.
>> *
>> + * @mode: access rights to specified pmic peripheral.
>> * @non_data_cmd: on v1 issues an spmi non-data command.
>> * on v2 no HW support, returns -EOPNOTSUPP.
>> * @offset: on v1 offset of per-ee channel.
>> @@ -160,6 +164,8 @@ struct spmi_pmic_arb_dev {
>> * on v2 offset of SPMI_PIC_IRQ_CLEARn.
>> */
>> struct pmic_arb_ver_ops {
>> + int (*mode)(struct spmi_pmic_arb_dev *dev, u8 sid, u16 addr,
>> + mode_t *mode);
>> /* spmi commands (read_cmd, write_cmd, cmd) functionality */
>> int (*offset)(struct spmi_pmic_arb_dev *dev, u8 sid, u16 addr,
>> u32 *offset);
>> @@ -313,11 +319,23 @@ static int pmic_arb_read_cmd(struct spmi_controller *ctrl, u8 opc, u8 sid,
>> u32 cmd;
>> int rc;
>> u32 offset;
>> + mode_t mode;
>>
>> rc = pmic_arb->ver_ops->offset(pmic_arb, sid, addr, &offset);
>> if (rc)
>> return rc;
>>
>> + rc = pmic_arb->ver_ops->mode(pmic_arb, sid, addr, &mode);
>> + if (rc)
>> + return rc;
>> +
>> + if (!(mode & S_IRUSR)) {
>
> Using mode_t for hardware access is odd. Perhaps just come up
> with some sort of READ/WRITE enum instead (if this sort of
> checking is even needed)?
>
Sure. Will address in the next clean up patch.
>> + dev_err(&pmic_arb->spmic->dev,
>
> The dev_err() just after uses ctrl->dev? Why not here?
>
Will address in the next clean up patch.
>> + "error: impermissible read from peripheral sid:%d addr:0x%x\n",
>> + sid, addr);
>> + return -EPERM;
>> + }
>> +
>> if (bc >= PMIC_ARB_MAX_TRANS_BYTES) {
>> dev_err(&ctrl->dev,
>> "pmic-arb supports 1..%d bytes per trans, but:%zu requested",
>> @@ -364,11 +382,23 @@ static int pmic_arb_write_cmd(struct spmi_controller *ctrl, u8 opc, u8 sid,
>> u32 cmd;
>> int rc;
>> u32 offset;
>> + mode_t mode;
>>
>> rc = pmic_arb->ver_ops->offset(pmic_arb, sid, addr, &offset);
>> if (rc)
>> return rc;
>>
>> + rc = pmic_arb->ver_ops->mode(pmic_arb, sid, addr, &mode);
>> + if (rc)
>> + return rc;
>> +
>> + if (!(mode & S_IWUSR)) {
>> + dev_err(&pmic_arb->spmic->dev,
>
> The dev_err() just after uses ctrl->dev? Why not here?
>
Will address in the next clean up patch.
>> + "error: impermissible write to peripheral sid:%d addr:0x%x\n",
>> + sid, addr);
>> + return -EPERM;
>> + }
>> +
>> if (bc >= PMIC_ARB_MAX_TRANS_BYTES) {
>> dev_err(&ctrl->dev,
>> "pmic-arb supports 1..%d bytes per trans, but:%zu requested",
>> @@ -727,6 +757,13 @@ static int qpnpint_irq_domain_map(struct irq_domain *d,
>> return 0;
>> }
>>
>> +static int
>> +pmic_arb_mode_v1(struct spmi_pmic_arb_dev *pa, u8 sid, u16 addr, mode_t *mode)
>> +{
>> + *mode = S_IRUSR | S_IWUSR;
>> + return 0;
>
> If mode was only positive then errors could be negative and no
> access could be 0. Then we could just return the mode from the
> function instead of passing a pointer around.
>
Will modify in the next clean up patch.
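
A minimal sketch of the suggestion above, combined with the earlier READ/WRITE enum idea. Everything here is hypothetical and only illustrates the calling convention: `enum pmic_arb_access`, `struct chan_info` and `pmic_arb_access_mode()` are invented for the example and are not part of the driver. The "reads allowed on any mapped channel, writes only for the owning EE" policy mirrors pmic_arb_mode_v2() in this patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical access flags replacing mode_t / S_IRUSR / S_IWUSR. */
enum pmic_arb_access {
	PMIC_ARB_ACCESS_NONE  = 0,
	PMIC_ARB_ACCESS_READ  = 1 << 0,
	PMIC_ARB_ACCESS_WRITE = 1 << 1,
};

/* Hypothetical, trimmed-down stand-in for the per-channel state. */
struct chan_info {
	int valid;     /* non-zero if the ppid maps to a channel */
	uint8_t owner; /* EE that owns the channel */
};

/*
 * Returns a negative errno on error, 0 for no access, or a positive
 * bitmask of pmic_arb_access flags. No output pointer is needed
 * because errors and modes occupy disjoint ranges of the return value.
 */
static int pmic_arb_access_mode(const struct chan_info *chan, uint8_t ee)
{
	int mode;

	if (!chan->valid)
		return -ENODEV;

	/* Any mapped channel is readable; only the owner may write. */
	mode = PMIC_ARB_ACCESS_READ;
	if (chan->owner == ee)
		mode |= PMIC_ARB_ACCESS_WRITE;

	return mode;
}
```

A caller would then test `rc < 0` for errors first and check `rc & PMIC_ARB_ACCESS_WRITE` (or `..._READ`) afterwards, returning -EPERM when the bit is clear.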
>> +}
>> +
>> /* v1 offset per ee */
>> static int
>> pmic_arb_offset_v1(struct spmi_pmic_arb_dev *pa, u8 sid, u16 addr, u32 *offset)
>> @@ -745,7 +782,11 @@ static u16 pmic_arb_find_chan(struct spmi_pmic_arb_dev *pa, u16 ppid)
>> * PMIC_ARB_REG_CHNL is a table in HW mapping channel to ppid.
>> * ppid_to_chan is an in-memory invert of that table.
>> */
>> - for (chan = pa->last_channel; ; chan++) {
>> + for (chan = pa->last_channel; chan < pa->max_periph; chan++) {
>> + regval = readl_relaxed(pa->cnfg +
>> + SPMI_OWNERSHIP_TABLE_REG(chan));
>> + pa->chan_to_owner[chan] = SPMI_OWNERSHIP_PERIPH2OWNER(regval);
>> +
>> offset = PMIC_ARB_REG_CHNL(chan);
>> if (offset >= pa->core_size)
>> break;
>
> Seems like an unrelated change to the mapping logic?
>
Will address in the next clean up patch.
>> @@ -767,6 +808,27 @@ static u16 pmic_arb_find_chan(struct spmi_pmic_arb_dev *pa, u16 ppid)
>> }
>>
>>
>> +static int
>> +pmic_arb_mode_v2(struct spmi_pmic_arb_dev *pa, u8 sid, u16 addr, mode_t *mode)
>
> Probably spmi_pmic_arb_dev should be const here.
>
Will address in the next clean up patch.
>> +{
>> + u16 ppid = (sid << 8) | (addr >> 8);
>> + u16 chan;
>> + u8 owner;
>> +
>> + chan = pa->ppid_to_chan[ppid];
>> + if (!(chan & PMIC_ARB_CHAN_VALID))
>> + return -ENODEV;
>> +
>> + *mode = 0;
>> + *mode |= S_IRUSR;
>> +
>> + chan &= ~PMIC_ARB_CHAN_VALID;
>> + owner = pa->chan_to_owner[chan];
>> + if (owner == pa->ee)
>> + *mode |= S_IWUSR;
>> + return 0;
>> +}
>> +
>> /* v2 offset per ppid (chan) and per ee */
>> static int
>> pmic_arb_offset_v2(struct spmi_pmic_arb_dev *pa, u8 sid, u16 addr, u32 *offset)
>> @@ -879,6 +943,12 @@ static int spmi_pmic_arb_probe(struct platform_device *pdev)
>>
>> res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "core");
>> pa->core_size = resource_size(res);
>> + if (pa->core_size <= 0x800) {
>> + dev_err(&pdev->dev, "core_size is smaller than 0x800. Failing Probe\n");
>
> Not sure why probe is capitalized.
>
Will address in the next clean up patch.
>> + err = -EINVAL;
>> + goto err_put_ctrl;
>> + }
>> +
>
> We don't need these sorts of DT validation checks. Please remove.
>
Will remove it in the next clean up patch.
>> core = devm_ioremap_resource(&ctrl->dev, res);
>> if (IS_ERR(core)) {
>> err = PTR_ERR(core);
>> @@ -899,6 +969,9 @@ static int spmi_pmic_arb_probe(struct platform_device *pdev)
>> pa->core = core;
>> pa->ver_ops = &pmic_arb_v2;
>>
>> + /* the apid to ppid table starts at PMIC_ARB_REG_CHNL(0) */
>> + pa->max_periph = (pa->core_size - PMIC_ARB_REG_CHNL(0)) / 4;
>> +
>> res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "obsrvr");
>> pa->rd_base = devm_ioremap_resource(&ctrl->dev, res);
>>
>> res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "intr");
>> --
>> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
>> member of Code Aurora Forum, hosted by The Linux Foundation
>> --
>
> P.S. Please put a newline in your signature so it doesn't exceed
> 80 columns.
Thanks for the suggestion.
On 06/12, [email protected] wrote:
> On 2017-05-31 06:03, Stephen Boyd wrote:
> >On 05/30, Kiran Gunda wrote:
> >>From: Abhijeet Dharmapurikar <[email protected]>
> >>
> >>The system crashes due to bad access when reading from a non-configured
> >>peripheral and when writing to a peripheral which is not owned by the
> >>current ee. This patch verifies ownership to avoid crashing on write.
> >
> >What systems? As far as I know we don't have any bad accesses
> >happening right now. If they are happening, we should fix the
> >code that's accessing hardware that isn't owned by them.
> >
> This change greatly improves the debugging effort for developers by
> printing
> a very simple and clear error message when an invalid SPMI access occurs
> (due to bad DT configuration, bad bootloader SPMI permission
> configurations,
> or other issues). Without this change, such accesses will cause XPU
> violations
> that crash the system and require extensive effort to decode.
Right, but they're easily detectable because we would know almost
immediately that something isn't working when we integrate a
change. If you update the DT and it stops working, the DT is bad.
If you update the bootloader and it stops working, the bootloader
is bad, etc.
>
> >>For reads, since the forward mapping table, data_channel->ppid, is
> >>towards the end of the block, we use the core size to figure the
> >>max number of ppids supported. The table starts at an offset of 0x800
> >>within the block, so size - 0x800 will give us the area used by the
> >>table. Since each table entry is 4 bytes long, (core_size - 0x800) / 4
> >>gives us the number of data_channels supported.
> >>This new protection is functional on hw v2.
> >
> >Which brings us to the next question which is why do we need this
> >patch at all? We aren't probing hardware to see what we have
> >access to and then populating device structures based on that.
> >Instead, we're just populating DT nodes that we've hardcoded in
> >the dts files, so I'm a little lost on why we would have a node
> >in there that we couldn't access. Please add such details to the
> >commit text.
> >
> invalid SPMI access occurs due to bad DT configuration, bad
> bootloader SPMI
> permission configurations, or other issues. This change reduces the
> debugging
> effort for developers by printing clear error message when an
> invalid SPMI
> access occurs.
Well we also take an overhead on every read/write. Sure things
are slow so the overhead is negligible, but the permissions are
on a peripheral id basis, so really we should look into _not_
populating devices that aren't accessible in the first place.
Then we move the checks out of the read/write path and to a more
logical place whereby we prevent a driver from attempting to even
attach to read or write a register that is protected.
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
On 2017-06-13 07:39, Stephen Boyd wrote:
> On 06/12, [email protected] wrote:
>> On 2017-05-31 06:03, Stephen Boyd wrote:
>> >On 05/30, Kiran Gunda wrote:
>> >>From: Abhijeet Dharmapurikar <[email protected]>
>> >>
>> >>The system crashes due to bad access when reading from a non-configured
>> >>peripheral and when writing to a peripheral which is not owned by the
>> >>current ee. This patch verifies ownership to avoid crashing on write.
>> >
>> >What systems? As far as I know we don't have any bad accesses
>> >happening right now. If they are happening, we should fix the
>> >code that's accessing hardware that isn't owned by them.
>> >
>> This change greatly improves the debugging effort for developers by
>> printing
>> a very simple and clear error message when an invalid SPMI access
>> occurs
>> (due to bad DT configuration, bad bootloader SPMI permission
>> configurations,
>> or other issues). Without this change, such accesses will cause XPU
>> violations
>> that crash the system and require extensive effort to decode.
>
> Right, but they're easily detectable because we would know almost
> immediately that something isn't working when we integrate a
> change. If you update the DT and it stops working, the DT is bad.
> If you update the bootloader and it stops working, the bootloader
> is bad, etc.
>
Ok. Will send a patch to remove this code in the next series.
>>
>> >>For reads, since the forward mapping table, data_channel->ppid, is
>> >>towards the end of the block, we use the core size to figure the
>> >>max number of ppids supported. The table starts at an offset of 0x800
>> >>within the block, so size - 0x800 will give us the area used by the
>> >>table. Since each table entry is 4 bytes long, (core_size - 0x800) / 4
>> >>gives us the number of data_channels supported.
>> >>This new protection is functional on hw v2.
>> >
>> >Which brings us to the next question which is why do we need this
>> >patch at all? We aren't probing hardware to see what we have
>> >access to and then populating device structures based on that.
>> >Instead, we're just populating DT nodes that we've hardcoded in
>> >the dts files, so I'm a little lost on why we would have a node
>> >in there that we couldn't access. Please add such details to the
>> >commit text.
>> >
>> invalid SPMI access occurs due to bad DT configuration, bad
>> bootloader SPMI
>> permission configurations, or other issues. This change reduces the
>> debugging
>> effort for developers by printing clear error message when an
>> invalid SPMI
>> access occurs.
>
> Well we also take an overhead on every read/write. Sure things
> are slow so the overhead is negligible, but the permissions are
> on a peripheral id basis, so really we should look into _not_
> populating devices that aren't accessible in the first place.
> Then we move the checks out of the read/write path and to a more
> logical place whereby we prevent a driver from attempting to even
> attach to read or write a register that is protected.
Ok. Will remove this code in the next patch series and try to implement it
as per your suggestion.
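
One way to picture the suggested direction, moving the ownership check out of the read/write path and into device population, is the rough sketch below. It is an assumption-laden illustration rather than a proposal for the actual driver: `struct periph_node` and `populate_owned()` are hypothetical, and "accessible" is simplified here to "owned by the current EE", which may not match the final policy.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one peripheral listed in DT. */
struct periph_node {
	uint16_t ppid;  /* (sid << 8) | (addr >> 8), as in the patch */
	uint8_t owner;  /* owning EE, as read from the ownership table */
	int populated;  /* set when a device would be created */
};

/*
 * Mark only the peripherals owned by this EE as populated, so that no
 * driver ever attaches to a peripheral it cannot access. Returns the
 * number of peripherals kept.
 */
static size_t populate_owned(struct periph_node *nodes, size_t n, uint8_t ee)
{
	size_t kept = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		nodes[i].populated = (nodes[i].owner == ee);
		if (nodes[i].populated)
			kept++;
	}

	return kept;
}
```

The check then runs once per peripheral at populate time instead of on every SPMI read or write.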