Hi,
On Mon, May 22, 2017 at 08:48:32PM +0800, Shaokun Zhang wrote:
> +/*
> + * hisi_djtag_lock_v2: djtag lock to avoid djtag access conflict b/w kernel
> + * and UEFI.
The mention of UEFI here worries me somewhat, and I have a number of
questions specifically relating to how we interact with UEFI here.
When precisely does UEFI need to touch the djtag hardware? e.g. does
this happen in runtime services? ... or completely asynchronously?
What does UEFI do with djtag when it holds the lock?
Are there other software agents (e.g. secure firmware) which try to
take this lock?
Can you explain how the locking scheme works? e.g. is this an advisory
software-only policy, or does the hardware prohibit accesses from other
agents somehow?
What happens if the kernel takes the lock, but doesn't release it?
What happens if UEFI takes the lock, but doesn't release it?
Are there any points at which UEFI needs to hold the locks of several
djtag units simultaneously?
> + * @reg_base: djtag register base address
> + *
> + * Return - none.
> + */
> +static void hisi_djtag_lock_v2(void __iomem *regs_base)
> +{
> + u32 rd, wr, mod_sel;
> +
> + /* Continuously poll to ensure the djtag is free */
> + while (1) {
> + rd = readl(regs_base + SC_DJTAG_MSTR_CFG_EX);
> + mod_sel = ((rd >> DEBUG_MODULE_SEL_SHIFT_EX) & 0xFF);
> + if (mod_sel == SC_DJTAG_V2_UNLOCK) {
> + wr = ((rd & SC_DJTAG_V2_MODULE_SEL_MASK) |
> + (SC_DJTAG_V2_KERNEL_LOCK <<
> + DEBUG_MODULE_SEL_SHIFT_EX));
> + writel(wr, regs_base + SC_DJTAG_MSTR_CFG_EX);
> + udelay(10); /* 10us */
> +
> + rd = readl(regs_base + SC_DJTAG_MSTR_CFG_EX);
> + mod_sel = ((rd >> DEBUG_MODULE_SEL_SHIFT_EX) & 0xFF);
> + if (mod_sel == SC_DJTAG_V2_KERNEL_LOCK)
> + break;
> + }
> + udelay(10); /* 10us */
> + }
> +}
> +
> +/*
> + * hisi_djtag_unlock_v2: djtag unlock
> + * @reg_base: djtag register base address
> + *
> + * Return - none.
> + */
> +static void hisi_djtag_unlock_v2(void __iomem *regs_base)
> +{
> + u32 rd, wr;
> +
> + rd = readl(regs_base + SC_DJTAG_MSTR_CFG_EX);
> +
> + wr = ((rd & SC_DJTAG_V2_MODULE_SEL_MASK) |
> + (SC_DJTAG_V2_UNLOCK << DEBUG_MODULE_SEL_SHIFT_EX));
> + /*
> + * Release djtag module by writing to module
> + * selection bits of DJTAG_MSTR_CFG register
> + */
> + writel(wr, regs_base + SC_DJTAG_MSTR_CFG_EX);
> +}
[...]
> +static void hisi_djtag_prepare_v2(void __iomem *regs_base, u32 offset,
> + u32 mod_sel, u32 mod_mask)
> +{
> + /* djtag mster enable */
s/mster/master/ ?
[...]
> +static ssize_t show_modalias(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct hisi_djtag_client *client = to_hisi_djtag_client(dev);
> +
> + return sprintf(buf, "%s%s\n", MODULE_PREFIX, dev_name(&client->dev));
> +}
> +static DEVICE_ATTR(modalias, 0444, show_modalias, NULL);
> +
> +static struct attribute *hisi_djtag_dev_attrs[] = {
> + /* modalias helps coldplug: modprobe $(cat .../modalias) */
> + &dev_attr_modalias.attr,
> + NULL
> +};
> +ATTRIBUTE_GROUPS(hisi_djtag_dev);
> +
> +static struct device_type hisi_djtag_client_type = {
> + .groups = hisi_djtag_dev_groups,
> +};
Can you elaborate on what this sysfs stuff is for?
> +static int hisi_djtag_device_match(struct device *dev,
> + struct device_driver *drv)
> +{
> + /* Attempt an OF style match */
> + if (of_driver_match_device(dev, drv))
> + return true;
> +
> +#ifdef CONFIG_ACPI
> + if (acpi_driver_match_device(dev, drv))
> + return true;
> +#endif
You can drop the ifdef here. When CONFIG_ACPI is not selected,
acpi_driver_match_device() is a static inline that always returns false.
> + return false;
> +}
[...]
> +static int hisi_djtag_set_client_name(struct hisi_djtag_client *client,
> + const char *device_name)
> +{
> + char name[DJTAG_CLIENT_NAME_LEN];
> + int id;
> +
> + id = hisi_djtag_get_client_id(client);
> + if (id < 0)
> + return id;
> +
> + client->id = id;
> + snprintf(name, DJTAG_CLIENT_NAME_LEN, "%s%s_%d", DJTAG_PREFIX,
> + device_name, client->id);
> + dev_set_name(&client->dev, "%s", name);
> +
> + return 0;
> +}
Given dev_set_name() takes a fmt string, you don't need the temporary
name here, and can have dev_set_name() handle that directly, e.g.
err = dev_set_name(&client->dev, %s%s_%d",
DJTAG_PREFIX, device_name, client->id);
if (err)
return err;
Given DJTAG_PREFIX is only used here, it would be better to fold it into
the format string, also.
> +
> +static int hisi_djtag_new_of_device(struct hisi_djtag_host *host,
> + struct device_node *node)
> +{
> + struct hisi_djtag_client *client;
> + int ret;
> +
> + client = hisi_djtag_client_alloc(host);
> + if (!client) {
> + dev_err(&host->dev, "DT: Client alloc fail!\n");
> + return -ENOMEM;
> + }
> +
> + client->dev.of_node = of_node_get(node);
> +
> + ret = hisi_djtag_set_client_name(client, node->name);
I don't think it's a good idea to take the name directly from the DT.
Can we please use a standard name, looked up based on the compatible
string? Then we can also use the same names for ACPI. Where there are
multiple instances, we can use the module-id too.
[...]
> +static void djtag_register_devices(struct hisi_djtag_host *host)
> +{
> + struct device_node *node;
> +
> + if (host->of_node) {
> + for_each_available_child_of_node(host->of_node, node) {
> + if (of_node_test_and_set_flag(node, OF_POPULATED))
> + continue;
> + if (hisi_djtag_new_of_device(host, node))
> + break;
Why do we stop, rather than rolling back and freeing what we allocated?
Either that, or we should return an error code, such that a higher level
can choose to do so.
> + }
> + } else if (host->acpi_handle) {
> + acpi_handle handle = host->acpi_handle;
> + acpi_status status;
> +
> + status = acpi_walk_namespace(ACPI_TYPE_DEVICE, handle, 1,
> + djtag_add_new_acpi_device, NULL,
> + host, NULL);
> + if (ACPI_FAILURE(status)) {
> + dev_err(&host->dev,
> + "ACPI: Fail to read client devices!\n");
> + return;
Likewise, why aren't we rolling back?
[...]
> +#define DJTAG_CLIENT_NAME_LEN 32
I beleive this can go.
Thanks,
Mark.
On 08/06/2017 17:35, Mark Rutland wrote:
> Hi,
>
> On Mon, May 22, 2017 at 08:48:32PM +0800, Shaokun Zhang wrote:
>> +/*
>> + * hisi_djtag_lock_v2: djtag lock to avoid djtag access conflict b/w kernel
>> + * and UEFI.
>
> The mention of UEFI here worries me somewhat, and I have a number of
> questions specifically relating to how we interact with UEFI here.
>
Hi Mark,
This djtag locking mechanism is an advisory software-only policy. The
problem is the hardware designers made an interface which does not
consider multiple agents in the system concurrently accessing the djtag
registers.
System wide, djtag is used as an interface to other HW modules, but we
only use for perf HW in the kernel.
> When precisely does UEFI need to touch the djtag hardware? e.g. does
> this happen in runtime services? ... or completely asynchronously?
>
Actually it's trusted firmware which accesses for L3 cache management in
CPU hotplug
> What does UEFI do with djtag when it holds the lock?
>
As mentioned, cache management
> Are there other software agents (e.g. secure firmware) which try to
> take this lock?
>
No
> Can you explain how the locking scheme works? e.g. is this an advisory
> software-only policy, or does the hardware prohibit accesses from other
> agents somehow?
>
The locking scheme is a software solution to spinlock. It's uses djtag
module select register as the spinlock flag, to avoid using some shared
memory.
The tricky part is that there is no test-and-set hardware support, so we
use this algorithm:
- precondition: flag initially set unlocked
a. agent reads flag
- if not unlocked, continues to poll
- otherwise, writes agent's unique lock value to flag
b. agent waits defined amount of time *uninterrupted* and then checks
the flag
- if it is unchanged, it has the lock -> continue
- if it is changed, it means other agent is trying to access the
lock and got it, so it goes back to a.
c. has lock, so safe to access djtag
d. to unlock, release by writing "unlock" value to flag
> What happens if the kernel takes the lock, but doesn't release it?
>
This should not happen. We use spinlock_irqsave() when locking. However
I have noted that we can BUG if djtag access timeout, so we need to
release the lock at this point. I don't think the code handles this
properly now.
> What happens if UEFI takes the lock, but doesn't release it?
>
Again, we would not expect this to happen; but, if it does, Kernel
access should timeout.
> Are there any points at which UEFI needs to hold the locks of several
> djtag units simultaneously?
No
Thanks,
John
>
>> + * @reg_base: djtag register base address
>> + *
>> + * Return - none.
>> + */
>> +static void hisi_djtag_lock_v2(void __iomem *regs_base)
>> +{
>> + u32 rd, wr, mod_sel;
>> +
>> + /* Continuously poll to ensure the djtag is free */
>> + while (1) {
>> + rd = readl(regs_base + SC_DJTAG_MSTR_CFG_EX);
>> + mod_sel = ((rd >> DEBUG_MODULE_SEL_SHIFT_EX) & 0xFF);
>> + if (mod_sel == SC_DJTAG_V2_UNLOCK) {
>> + wr = ((rd & SC_DJTAG_V2_MODULE_SEL_MASK) |
>> + (SC_DJTAG_V2_KERNEL_LOCK <<
>> + DEBUG_MODULE_SEL_SHIFT_EX));
>> + writel(wr, regs_base + SC_DJTAG_MSTR_CFG_EX);
>> + udelay(10); /* 10us */
>> +
>> + rd = readl(regs_base + SC_DJTAG_MSTR_CFG_EX);
>> + mod_sel = ((rd >> DEBUG_MODULE_SEL_SHIFT_EX) & 0xFF);
>> + if (mod_sel == SC_DJTAG_V2_KERNEL_LOCK)
>> + break;
>> + }
>> + udelay(10); /* 10us */
>> + }
>> +}
>> +
>> +/*
[ ... ]
On Fri, Jun 09, 2017 at 03:18:39PM +0100, John Garry wrote:
> On 08/06/2017 17:35, Mark Rutland wrote:
> >Hi,
> >
> >On Mon, May 22, 2017 at 08:48:32PM +0800, Shaokun Zhang wrote:
> >>+/*
> >>+ * hisi_djtag_lock_v2: djtag lock to avoid djtag access conflict b/w kernel
> >>+ * and UEFI.
> >
> >The mention of UEFI here worries me somewhat, and I have a number of
> >questions specifically relating to how we interact with UEFI here.
> >
>
> Hi Mark,
>
> This djtag locking mechanism is an advisory software-only policy. The
> problem is the hardware designers made an interface which does not consider
> multiple agents in the system concurrently accessing the djtag registers.
>
> System wide, djtag is used as an interface to other HW modules, but we only
> use for perf HW in the kernel.
>
> >When precisely does UEFI need to touch the djtag hardware? e.g. does
> >this happen in runtime services? ... or completely asynchronously?
> >
>
> Actually it's trusted firmware which accesses for L3 cache management in CPU
> hotplug
>
> >What does UEFI do with djtag when it holds the lock?
> >
>
> As mentioned, cache management
>
> >Are there other software agents (e.g. secure firmware) which try to
> >take this lock?
> >
>
> No
>
> >Can you explain how the locking scheme works? e.g. is this an advisory
> >software-only policy, or does the hardware prohibit accesses from other
> >agents somehow?
> >
>
> The locking scheme is a software solution to spinlock. It's uses djtag
> module select register as the spinlock flag, to avoid using some shared
> memory.
>
> The tricky part is that there is no test-and-set hardware support, so we use
> this algorithm:
> - precondition: flag initially set unlocked
>
> a. agent reads flag
> - if not unlocked, continues to poll
> - otherwise, writes agent's unique lock value to flag
> b. agent waits defined amount of time *uninterrupted* and then checks the
> flag
How do you figure out this time period? Doesn't it need to be no shorter
than the longest critical section?
Will
On 09/06/2017 15:30, Will Deacon wrote:
> On Fri, Jun 09, 2017 at 03:18:39PM +0100, John Garry wrote:
>> On 08/06/2017 17:35, Mark Rutland wrote:
>>> Hi,
>>>
>>> On Mon, May 22, 2017 at 08:48:32PM +0800, Shaokun Zhang wrote:
>>>> +/*
>>>> + * hisi_djtag_lock_v2: djtag lock to avoid djtag access conflict b/w kernel
>>>> + * and UEFI.
>>>
>>> The mention of UEFI here worries me somewhat, and I have a number of
>>> questions specifically relating to how we interact with UEFI here.
>>>
>>
>> Hi Mark,
>>
>> This djtag locking mechanism is an advisory software-only policy. The
>> problem is the hardware designers made an interface which does not consider
>> multiple agents in the system concurrently accessing the djtag registers.
>>
>> System wide, djtag is used as an interface to other HW modules, but we only
>> use for perf HW in the kernel.
>>
>>> When precisely does UEFI need to touch the djtag hardware? e.g. does
>>> this happen in runtime services? ... or completely asynchronously?
>>>
>>
>> Actually it's trusted firmware which accesses for L3 cache management in CPU
>> hotplug
>>
>>> What does UEFI do with djtag when it holds the lock?
>>>
>>
>> As mentioned, cache management
>>
>>> Are there other software agents (e.g. secure firmware) which try to
>>> take this lock?
>>>
>>
>> No
>>
>>> Can you explain how the locking scheme works? e.g. is this an advisory
>>> software-only policy, or does the hardware prohibit accesses from other
>>> agents somehow?
>>>
>>
>> The locking scheme is a software solution to spinlock. It's uses djtag
>> module select register as the spinlock flag, to avoid using some shared
>> memory.
>>
>> The tricky part is that there is no test-and-set hardware support, so we use
>> this algorithm:
>> - precondition: flag initially set unlocked
>>
>> a. agent reads flag
>> - if not unlocked, continues to poll
>> - otherwise, writes agent's unique lock value to flag
>> b. agent waits defined amount of time *uninterrupted* and then checks the
>> flag
>
> How do you figure out this time period? Doesn't it need to be no shorter
> than the longest critical section?
>
Hi Will,
As you know, we need to delay to guard against contenting set-and-check.
And the ratio in delay duration would be 2:1 for agents to guard against
race of the contended set-and-check.
As for the specific time, we were working the basis that a delay of 10us
would be more than adequate time for the set-and-check to complete.
Sorry, but I didn't get critical section question. Are you questioning
the possiblity of one agent getting the lock, doing it's djtag
operation, and releasing, all while other agent is waiting on it's own
set-and-check?
Thanks,
John
> Will
>
> .
>
On Fri, Jun 09, 2017 at 03:18:39PM +0100, John Garry wrote:
> On 08/06/2017 17:35, Mark Rutland wrote:
> >On Mon, May 22, 2017 at 08:48:32PM +0800, Shaokun Zhang wrote:
> >>+/*
> >>+ * hisi_djtag_lock_v2: djtag lock to avoid djtag access conflict b/w kernel
> >>+ * and UEFI.
> >
> >The mention of UEFI here worries me somewhat, and I have a number of
> >questions specifically relating to how we interact with UEFI here.
>
> Hi Mark,
>
> This djtag locking mechanism is an advisory software-only policy.
> The problem is the hardware designers made an interface which does
> not consider multiple agents in the system concurrently accessing
> the djtag registers.
>
> System wide, djtag is used as an interface to other HW modules, but
> we only use for perf HW in the kernel.
>
> >When precisely does UEFI need to touch the djtag hardware? e.g. does
> >this happen in runtime services? ... or completely asynchronously?
>
> Actually it's trusted firmware which accesses for L3 cache
> management in CPU hotplug
Ok.
What happens if the lock is already held by an agent in that case?
Does the FW block until the lock is released?
Can you elaborate on CPU hotplug? Which CPU is performing the
maintenance in this scenario, and when? Can this block other CPUs until
the lock is released?
What happens if another agent pokes the djtag (without acquiring the
lock) while FW is doing this? Can this result in issues on the secure
side?
[...]
> >Can you explain how the locking scheme works? e.g. is this an
> >advisory software-only policy, or does the hardware prohibit accesses
> >from other agents somehow?
>
> The locking scheme is a software solution to spinlock. It's uses
> djtag module select register as the spinlock flag, to avoid using
> some shared memory.
>
> The tricky part is that there is no test-and-set hardware support,
> so we use this algorithm:
> - precondition: flag initially set unlocked
>
> a. agent reads flag
> - if not unlocked, continues to poll
> - otherwise, writes agent's unique lock value to flag
> b. agent waits defined amount of time *uninterrupted* and then
> checks the flag
> - if it is unchanged, it has the lock -> continue
> - if it is changed, it means other agent is trying to access the
> lock and got it, so it goes back to a.
> c. has lock, so safe to access djtag
> d. to unlock, release by writing "unlock" value to flag
This does not sound safe to me. There's always the potential for a race,
no matter how long an agent waits.
> >What happens if the kernel takes the lock, but doesn't release it?
>
> This should not happen. We use spinlock_irqsave() when locking.
> However I have noted that we can BUG if djtag access timeout, so we
> need to release the lock at this point. I don't think the code
> handles this properly now.
I was worried aobut BUG() and friends, and also preempt kernels.
It doesn't sound like it's possible to make this robust.
> >What happens if UEFI takes the lock, but doesn't release it?
>
> Again, we would not expect this to happen; but, if it does, Kernel
> access should timeout.
... which they do not, in this patch series, as far as I can tell.
This doesn't sound safe at all. :/
Thanks,
Mark.
Hi Mark,
> What happens if the lock is already held by an agent in that case?
>
> Does the FW block until the lock is released?
The FW must also honour the contract - it must also block until the lock
is available.
This may sound bad, but, in reality, probablity of simultaneous perf and
hotplug access is very low.
>
> Can you elaborate on CPU hotplug? Which CPU is performing the
> maintenance in this scenario, and when? Can this block other CPUs until
> the lock is released?
>
> What happens if another agent pokes the djtag (without acquiring the
> lock) while FW is doing this? Can this result in issues on the secure
> side?
>
I need to check on this.
> [...]
>
>>> Can you explain how the locking scheme works? e.g. is this an
>>> advisory software-only policy, or does the hardware prohibit accesses
>> >from other agents somehow?
>>
>> The locking scheme is a software solution to spinlock. It's uses
>> djtag module select register as the spinlock flag, to avoid using
>> some shared memory.
>>
>> The tricky part is that there is no test-and-set hardware support,
>> so we use this algorithm:
>> - precondition: flag initially set unlocked
>>
>> a. agent reads flag
>> - if not unlocked, continues to poll
>> - otherwise, writes agent's unique lock value to flag
>> b. agent waits defined amount of time *uninterrupted* and then
>> checks the flag
>> - if it is unchanged, it has the lock -> continue
>> - if it is changed, it means other agent is trying to access the
>> lock and got it, so it goes back to a.
>> c. has lock, so safe to access djtag
>> d. to unlock, release by writing "unlock" value to flag
>
> This does not sound safe to me. There's always the potential for a race,
> no matter how long an agent waits.
>
>>> What happens if the kernel takes the lock, but doesn't release it?
>>
>> This should not happen. We use spinlock_irqsave() when locking.
>> However I have noted that we can BUG if djtag access timeout, so we
>> need to release the lock at this point. I don't think the code
>> handles this properly now.
>
> I was worried aobut BUG() and friends, and also preempt kernels.
>
> It doesn't sound like it's possible to make this robust.
>
>>> What happens if UEFI takes the lock, but doesn't release it?
>>
>> Again, we would not expect this to happen; but, if it does, Kernel
>> access should timeout.
>
> ... which they do not, in this patch series, as far as I can tell.
>
I noticed this also.
> This doesn't sound safe at all. :/
Right, we need to consider if this will fly at all.
At this point, we would rather concentrate on our new chipset, which is
based on same perf HW architecture (so much code reuse), but uses
directly mapped registers and *no djtag* - in this, most of the upstream
effort from all parties is not wasted.
Please advise.
Much appreciated,
John
>
> Thanks,
> Mark.
>
> .
>
On Fri, Jun 09, 2017 at 05:09:25PM +0100, John Garry wrote:
> At this point, we would rather concentrate on our new chipset, which
> is based on same perf HW architecture (so much code reuse), but uses
> directly mapped registers and *no djtag* - in this, most of the
> upstream effort from all parties is not wasted.
FWIW, I suspect the MMIO-based PMU is going to have a smoother path
upstream, assuming it's not shared with other agents (and we don't have
another locking scheme to contend with).
Thanks,
Mark.
Hi Mark,
On 2017/6/9 0:35, Mark Rutland wrote:
> Hi,
>
> On Mon, May 22, 2017 at 08:48:32PM +0800, Shaokun Zhang wrote:
>> +/*
>> + * hisi_djtag_lock_v2: djtag lock to avoid djtag access conflict b/w kernel
>> + * and UEFI.
>
> The mention of UEFI here worries me somewhat, and I have a number of
> questions specifically relating to how we interact with UEFI here.
>
> When precisely does UEFI need to touch the djtag hardware? e.g. does
> this happen in runtime services? ... or completely asynchronously?
>
> What does UEFI do with djtag when it holds the lock?
>
> Are there other software agents (e.g. secure firmware) which try to
> take this lock?
>
> Can you explain how the locking scheme works? e.g. is this an advisory
> software-only policy, or does the hardware prohibit accesses from other
> agents somehow?
>
> What happens if the kernel takes the lock, but doesn't release it?
>
> What happens if UEFI takes the lock, but doesn't release it?
>
> Are there any points at which UEFI needs to hold the locks of several
> djtag units simultaneously?
>
>> + * @reg_base: djtag register base address
>> + *
>> + * Return - none.
>> + */
>> +static void hisi_djtag_lock_v2(void __iomem *regs_base)
>> +{
>> + u32 rd, wr, mod_sel;
>> +
>> + /* Continuously poll to ensure the djtag is free */
>> + while (1) {
>> + rd = readl(regs_base + SC_DJTAG_MSTR_CFG_EX);
>> + mod_sel = ((rd >> DEBUG_MODULE_SEL_SHIFT_EX) & 0xFF);
>> + if (mod_sel == SC_DJTAG_V2_UNLOCK) {
>> + wr = ((rd & SC_DJTAG_V2_MODULE_SEL_MASK) |
>> + (SC_DJTAG_V2_KERNEL_LOCK <<
>> + DEBUG_MODULE_SEL_SHIFT_EX));
>> + writel(wr, regs_base + SC_DJTAG_MSTR_CFG_EX);
>> + udelay(10); /* 10us */
>> +
>> + rd = readl(regs_base + SC_DJTAG_MSTR_CFG_EX);
>> + mod_sel = ((rd >> DEBUG_MODULE_SEL_SHIFT_EX) & 0xFF);
>> + if (mod_sel == SC_DJTAG_V2_KERNEL_LOCK)
>> + break;
>> + }
>> + udelay(10); /* 10us */
>> + }
>> +}
>> +
>> +/*
>> + * hisi_djtag_unlock_v2: djtag unlock
>> + * @reg_base: djtag register base address
>> + *
>> + * Return - none.
>> + */
>> +static void hisi_djtag_unlock_v2(void __iomem *regs_base)
>> +{
>> + u32 rd, wr;
>> +
>> + rd = readl(regs_base + SC_DJTAG_MSTR_CFG_EX);
>> +
>> + wr = ((rd & SC_DJTAG_V2_MODULE_SEL_MASK) |
>> + (SC_DJTAG_V2_UNLOCK << DEBUG_MODULE_SEL_SHIFT_EX));
>> + /*
>> + * Release djtag module by writing to module
>> + * selection bits of DJTAG_MSTR_CFG register
>> + */
>> + writel(wr, regs_base + SC_DJTAG_MSTR_CFG_EX);
>> +}
>
> [...]
>
>> +static void hisi_djtag_prepare_v2(void __iomem *regs_base, u32 offset,
>> + u32 mod_sel, u32 mod_mask)
>> +{
>> + /* djtag mster enable */
>
> s/mster/master/ ?
>
True, this is one of my typo.
> [...]
>
>> +static ssize_t show_modalias(struct device *dev,
>> + struct device_attribute *attr, char *buf)
>> +{
>> + struct hisi_djtag_client *client = to_hisi_djtag_client(dev);
>> +
>> + return sprintf(buf, "%s%s\n", MODULE_PREFIX, dev_name(&client->dev));
>> +}
>> +static DEVICE_ATTR(modalias, 0444, show_modalias, NULL);
>> +
>> +static struct attribute *hisi_djtag_dev_attrs[] = {
>> + /* modalias helps coldplug: modprobe $(cat .../modalias) */
>> + &dev_attr_modalias.attr,
>> + NULL
>> +};
>> +ATTRIBUTE_GROUPS(hisi_djtag_dev);
>> +
>> +static struct device_type hisi_djtag_client_type = {
>> + .groups = hisi_djtag_dev_groups,
>> +};
>
> Can you elaborate on what this sysfs stuff is for?
>
Yes. It just displays the name of djtag device node and is useless. We will delete
it.
>> +static int hisi_djtag_device_match(struct device *dev,
>> + struct device_driver *drv)
>> +{
>> + /* Attempt an OF style match */
>> + if (of_driver_match_device(dev, drv))
>> + return true;
>> +
>> +#ifdef CONFIG_ACPI
>> + if (acpi_driver_match_device(dev, drv))
>> + return true;
>> +#endif
>
> You can drop the ifdef here. When CONFIG_ACPI is not selected,
> acpi_driver_match_device() is a static inline that always returns false.
>
Agree. We shall simplify it.
>> + return false;
>> +}
>
> [...]
>
>> +static int hisi_djtag_set_client_name(struct hisi_djtag_client *client,
>> + const char *device_name)
>> +{
>> + char name[DJTAG_CLIENT_NAME_LEN];
>> + int id;
>> +
>> + id = hisi_djtag_get_client_id(client);
>> + if (id < 0)
>> + return id;
>> +
>> + client->id = id;
>> + snprintf(name, DJTAG_CLIENT_NAME_LEN, "%s%s_%d", DJTAG_PREFIX,
>> + device_name, client->id);
>> + dev_set_name(&client->dev, "%s", name);
>> +
>> + return 0;
>> +}
>
> Given dev_set_name() takes a fmt string, you don't need the temporary
> name here, and can have dev_set_name() handle that directly, e.g.
>
> err = dev_set_name(&client->dev, %s%s_%d",
> DJTAG_PREFIX, device_name, client->id);
> if (err)
> return err;
>
>
> Given DJTAG_PREFIX is only used here, it would be better to fold it into
> the format string, also.
>
Agree, we will modify it.
>> +
>> +static int hisi_djtag_new_of_device(struct hisi_djtag_host *host,
>> + struct device_node *node)
>> +{
>> + struct hisi_djtag_client *client;
>> + int ret;
>> +
>> + client = hisi_djtag_client_alloc(host);
>> + if (!client) {
>> + dev_err(&host->dev, "DT: Client alloc fail!\n");
>> + return -ENOMEM;
>> + }
>> +
>> + client->dev.of_node = of_node_get(node);
>> +
>> + ret = hisi_djtag_set_client_name(client, node->name);
>
> I don't think it's a good idea to take the name directly from the DT.
>
> Can we please use a standard name, looked up based on the compatible
> string? Then we can also use the same names for ACPI. Where there are
> multiple instances, we can use the module-id too.
>
> [...]
>
Ok, shall modify it.
>> +static void djtag_register_devices(struct hisi_djtag_host *host)
>> +{
>> + struct device_node *node;
>> +
>> + if (host->of_node) {
>> + for_each_available_child_of_node(host->of_node, node) {
>> + if (of_node_test_and_set_flag(node, OF_POPULATED))
>> + continue;
>> + if (hisi_djtag_new_of_device(host, node))
>> + break;
>
> Why do we stop, rather than rolling back and freeing what we allocated?
>
> Either that, or we should return an error code, such that a higher level
> can choose to do so.
>
We need to improve the error handling and rollback.
>> + }
>> + } else if (host->acpi_handle) {
>> + acpi_handle handle = host->acpi_handle;
>> + acpi_status status;
>> +
>> + status = acpi_walk_namespace(ACPI_TYPE_DEVICE, handle, 1,
>> + djtag_add_new_acpi_device, NULL,
>> + host, NULL);
>> + if (ACPI_FAILURE(status)) {
>> + dev_err(&host->dev,
>> + "ACPI: Fail to read client devices!\n");
>> + return;
>
> Likewise, why aren't we rolling back?
>
> [...]
>
>> +#define DJTAG_CLIENT_NAME_LEN 32
>
> I beleive this can go.
>
ok.
thanks
Shaokun
> Thanks,
> Mark.
>
> .
>
On Fri, Jun 09, 2017 at 04:10:12PM +0100, John Garry wrote:
> On 09/06/2017 15:30, Will Deacon wrote:
> >On Fri, Jun 09, 2017 at 03:18:39PM +0100, John Garry wrote:
> >>On 08/06/2017 17:35, Mark Rutland wrote:
> >>>Hi,
> >>>
> >>>On Mon, May 22, 2017 at 08:48:32PM +0800, Shaokun Zhang wrote:
> >>>>+/*
> >>>>+ * hisi_djtag_lock_v2: djtag lock to avoid djtag access conflict b/w kernel
> >>>>+ * and UEFI.
> >>>
> >>>The mention of UEFI here worries me somewhat, and I have a number of
> >>>questions specifically relating to how we interact with UEFI here.
> >>>
> >>
> >>Hi Mark,
> >>
> >>This djtag locking mechanism is an advisory software-only policy. The
> >>problem is the hardware designers made an interface which does not consider
> >>multiple agents in the system concurrently accessing the djtag registers.
> >>
> >>System wide, djtag is used as an interface to other HW modules, but we only
> >>use for perf HW in the kernel.
> >>
> >>>When precisely does UEFI need to touch the djtag hardware? e.g. does
> >>>this happen in runtime services? ... or completely asynchronously?
> >>>
> >>
> >>Actually it's trusted firmware which accesses for L3 cache management in CPU
> >>hotplug
> >>
> >>>What does UEFI do with djtag when it holds the lock?
> >>>
> >>
> >>As mentioned, cache management
> >>
> >>>Are there other software agents (e.g. secure firmware) which try to
> >>>take this lock?
> >>>
> >>
> >>No
> >>
> >>>Can you explain how the locking scheme works? e.g. is this an advisory
> >>>software-only policy, or does the hardware prohibit accesses from other
> >>>agents somehow?
> >>>
> >>
> >>The locking scheme is a software solution to spinlock. It's uses djtag
> >>module select register as the spinlock flag, to avoid using some shared
> >>memory.
> >>
> >>The tricky part is that there is no test-and-set hardware support, so we use
> >>this algorithm:
> >>- precondition: flag initially set unlocked
> >>
> >>a. agent reads flag
> >> - if not unlocked, continues to poll
> >> - otherwise, writes agent's unique lock value to flag
> >>b. agent waits defined amount of time *uninterrupted* and then checks the
> >>flag
> >
> >How do you figure out this time period? Doesn't it need to be no shorter
> >than the longest critical section?
> >
>
> Hi Will,
>
> As you know, we need to delay to guard against contenting set-and-check. And
> the ratio in delay duration would be 2:1 for agents to guard against race of
> the contended set-and-check.
>
> As for the specific time, we were working the basis that a delay of 10us
> would be more than adequate time for the set-and-check to complete.
>
> Sorry, but I didn't get critical section question. Are you questioning the
> possiblity of one agent getting the lock, doing it's djtag operation, and
> releasing, all while other agent is waiting on it's own set-and-check?
Apologies, I misunderstood your algorithm (I thought step (a) was on one CPU
and step (b) was on another). Still, I don't understand the need for the
timeout. If you instead read back the flag immediately, wouldn't it still
work? e.g.
lock:
Readl_relaxed flag
if (locked)
goto lock;
Writel_relaxed unique ID to flag
Readl flag
if (locked by somebody else)
goto lock;
<critical section>
unlock:
Writel unlocked value to flag
Given that we're dealing with iomem, I think it will work, but I could be
missing something obvious.
Thoughts?
Will
On Wed, Jun 14, 2017 at 11:06:58AM +0100, Will Deacon wrote:
> Apologies, I misunderstood your algorithm (I thought step (a) was on one CPU
> and step (b) was on another). Still, I don't understand the need for the
> timeout. If you instead read back the flag immediately, wouldn't it still
> work? e.g.
>
>
> lock:
> Readl_relaxed flag
> if (locked)
> goto lock;
>
> Writel_relaxed unique ID to flag
> Readl flag
> if (locked by somebody else)
> goto lock;
>
> <critical section>
>
> unlock:
> Writel unlocked value to flag
>
>
> Given that we're dealing with iomem, I think it will work, but I could be
> missing something obvious.
Don't we have the race below where both threads can enter the critical
section?
// flag f initial zero (unlocked)
// t1, flag 1 // t2, flag 2
readl(f); // reads 0 l = readl(f); // reads 0
<thinks lock is free> <thinks lock is free>
writel(1, f);
readl(f); // reads 1
<thinks lock owned>
writel(2, f);
readl(f) // reads 2
<thinks lock owned>
<crticial section> <critical section>
Thanks,
Mark.
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
On Wed, Jun 14, 2017 at 11:06:58AM +0100, Will Deacon wrote:
> Apologies, I misunderstood your algorithm (I thought step (a) was on one CPU
> and step (b) was on another). Still, I don't understand the need for the
> timeout. If you instead read back the flag immediately, wouldn't it still
> work? e.g.
>
>
> lock:
> Readl_relaxed flag
> if (locked)
> goto lock;
>
> Writel_relaxed unique ID to flag
> Readl flag
> if (locked by somebody else)
> goto lock;
>
> <critical section>
>
> unlock:
> Writel unlocked value to flag
I think the delay is to counter this:
Agent 1 Agent 2
read flag
not locked
read flag
not locked
write unique ID
read back
not locked by someone else
write unique ID
read back
not locked by someone else
With the delay present, this becomes:
Agent 1 Agent 2
read flag
not locked
read flag
not locked
write unique ID
delay
write unique ID
delay
read back
locked by agent 2
read back
not locked by someone else
For this to work, the delay has to be guaranteed to be greater than
the maximum duration that any agent takes between the initial read
and the write of its unique ID. The delay doesn't even have to be
identical between each agent, it just has to satisfy that condition.
The key thing though is that the reads and writes must happen when
the program intends them to, so I don't think the _relaxed variants
should be used here. If they're buffered, then the delay doesn't
have the desired effect.
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
On Wed, Jun 14, 2017 at 11:42:30AM +0100, Mark Rutland wrote:
> On Wed, Jun 14, 2017 at 11:06:58AM +0100, Will Deacon wrote:
> > Apologies, I misunderstood your algorithm (I thought step (a) was on one CPU
> > and step (b) was on another). Still, I don't understand the need for the
> > timeout. If you instead read back the flag immediately, wouldn't it still
> > work? e.g.
> >
> >
> > lock:
> > Readl_relaxed flag
> > if (locked)
> > goto lock;
> >
> > Writel_relaxed unique ID to flag
> > Readl flag
> > if (locked by somebody else)
> > goto lock;
> >
> > <critical section>
> >
> > unlock:
> > Writel unlocked value to flag
> >
> >
> > Given that we're dealing with iomem, I think it will work, but I could be
> > missing something obvious.
>
> Don't we have the race below where both threads can enter the critical
> section?
>
> // flag f initial zero (unlocked)
>
> // t1, flag 1 // t2, flag 2
> readl(f); // reads 0 l = readl(f); // reads 0
>
> <thinks lock is free> <thinks lock is free>
>
> writel(1, f);
> readl(f); // reads 1
> <thinks lock owned>
> writel(2, f);
> readl(f) // reads 2
> <thinks lock owned>
>
> <crticial section> <critical section>
>
> Thanks,
> Mark.
> IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended
> recipient, please notify the sender immediately and do not disclose
> the contents to any other person, use it for any purpose, or store or
> copy the information in any medium. Thank you.
Please ignore the disclaimer on this mail; my client was mis-configured.
I will ensure I avoid sending bogus disclaimers in future.
Thanks,
Mark.
On Wed, Jun 14, 2017 at 11:42:30AM +0100, Mark Rutland wrote:
> On Wed, Jun 14, 2017 at 11:06:58AM +0100, Will Deacon wrote:
> > Apologies, I misunderstood your algorithm (I thought step (a) was on one CPU
> > and step (b) was on another). Still, I don't understand the need for the
> > timeout. If you instead read back the flag immediately, wouldn't it still
> > work? e.g.
> >
> >
> > lock:
> > Readl_relaxed flag
> > if (locked)
> > goto lock;
> >
> > Writel_relaxed unique ID to flag
> > Readl flag
> > if (locked by somebody else)
> > goto lock;
> >
> > <critical section>
> >
> > unlock:
> > Writel unlocked value to flag
> >
> >
> > Given that we're dealing with iomem, I think it will work, but I could be
> > missing something obvious.
>
> Don't we have the race below where both threads can enter the critical
> section?
>
> // flag f initial zero (unlocked)
>
> // t1, flag 1 // t2, flag 2
> readl(f); // reads 0 l = readl(f); // reads 0
>
> <thinks lock is free> <thinks lock is free>
>
> writel(1, f);
> readl(f); // reads 1
> <thinks lock owned>
> writel(2, f);
> readl(f) // reads 2
> <thinks lock owned>
>
> <crticial section> <critical section>
Urgh, yeah, of course and *that's* what the udelay is trying to avoid,
by "ensuring" that the <thinks lock is free> time and subsequent write
propagation is all over before we re-read the flag.
John -- how much space do you have on this device? Do you have, e.g. a byte
for each CPU?
Will
On Wed, Jun 14, 2017 at 11:48:07AM +0100, Russell King - ARM Linux wrote:
> On Wed, Jun 14, 2017 at 11:06:58AM +0100, Will Deacon wrote:
> > Apologies, I misunderstood your algorithm (I thought step (a) was on one CPU
> > and step (b) was on another). Still, I don't understand the need for the
> > timeout. If you instead read back the flag immediately, wouldn't it still
> > work? e.g.
> >
> >
> > lock:
> > Readl_relaxed flag
> > if (locked)
> > goto lock;
> >
> > Writel_relaxed unique ID to flag
> > Readl flag
> > if (locked by somebody else)
> > goto lock;
> >
> > <critical section>
> >
> > unlock:
> > Writel unlocked value to flag
>
> I think the delay is to counter this:
>
> Agent 1 Agent 2
> read flag
> not locked
> read flag
> not locked
> write unique ID
> read back
> not locked by someone else
> write unique ID
> read back
> not locked by someone else
>
> With the delay present, this becomes:
>
> Agent 1 Agent 2
> read flag
> not locked
> read flag
> not locked
> write unique ID
> delay
> write unique ID
> delay
> read back
> locked by agent 2
> read back
> not locked by someone else
>
> For this to work, the delay has to be guaranteed to be greater than
> the maximum duration that any agent takes between the initial read
> and the write of its unique ID. The delay doesn't even have to be
> identical between each agent, it just has to satisfy that condition.
I think that it also needs to account for write propagation delays.
> The key thing though is that the reads and writes must happen when
> the program intends them to, so I don't think the _relaxed variants
> should be used here. If they're buffered, then the delay doesn't
> have the desired effect.
If buffering is a concern, then I think the non-relaxed write has the
barrier on the wrong side, so relaxed + mb() would be better.
Will
On 14/06/2017 12:01, Will Deacon wrote:
> On Wed, Jun 14, 2017 at 11:42:30AM +0100, Mark Rutland wrote:
>> On Wed, Jun 14, 2017 at 11:06:58AM +0100, Will Deacon wrote:
>>> Apologies, I misunderstood your algorithm (I thought step (a) was on one CPU
>>> and step (b) was on another). Still, I don't understand the need for the
>>> timeout. If you instead read back the flag immediately, wouldn't it still
>>> work? e.g.
>>>
>>>
>>> lock:
>>> Readl_relaxed flag
>>> if (locked)
>>> goto lock;
>>>
>>> Writel_relaxed unique ID to flag
>>> Readl flag
>>> if (locked by somebody else)
>>> goto lock;
>>>
>>> <critical section>
>>>
>>> unlock:
>>> Writel unlocked value to flag
>>>
>>>
>>> Given that we're dealing with iomem, I think it will work, but I could be
>>> missing something obvious.
>>
>> Don't we have the race below where both threads can enter the critical
>> section?
>>
>> // flag f initial zero (unlocked)
>>
>> // t1, flag 1 // t2, flag 2
>> readl(f); // reads 0 l = readl(f); // reads 0
>>
>> <thinks lock is free> <thinks lock is free>
>>
>> writel(1, f);
>> readl(f); // reads 1
>> <thinks lock owned>
>> writel(2, f);
>> readl(f) // reads 2
>> <thinks lock owned>
>>
>> <crticial section> <critical section>
>
> Urgh, yeah, of course and *that's* what the udelay is trying to avoid,
> by "ensuring" that the <thinks lock is free> time and subsequent write
> propagation is all over before we re-read the flag.
>
> John -- how much space do you have on this device? Do you have, e.g. a byte
> for each CPU?
Hi Will,
To be clear, the agents in our case are the kernel and UEFI. Within the
kernel, we use a kernel spinlock to lock the same djtag between threads,
for these reasons:
- kernel has a native spinlock
- we are limited in locking values, as the lock flag is only a 8b field
in v2 hw (called module select)
Thanks,
John
>
> Will
>
> .
>
On Wed, Jun 14, 2017 at 12:35:07PM +0100, John Garry wrote:
> On 14/06/2017 12:01, Will Deacon wrote:
> >On Wed, Jun 14, 2017 at 11:42:30AM +0100, Mark Rutland wrote:
> >>On Wed, Jun 14, 2017 at 11:06:58AM +0100, Will Deacon wrote:
> >>>Apologies, I misunderstood your algorithm (I thought step (a) was on one CPU
> >>>and step (b) was on another). Still, I don't understand the need for the
> >>>timeout. If you instead read back the flag immediately, wouldn't it still
> >>>work? e.g.
> >>>
> >>>
> >>>lock:
> >>> Readl_relaxed flag
> >>> if (locked)
> >>> goto lock;
> >>>
> >>> Writel_relaxed unique ID to flag
> >>> Readl flag
> >>> if (locked by somebody else)
> >>> goto lock;
> >>>
> >>><critical section>
> >>>
> >>>unlock:
> >>> Writel unlocked value to flag
> >>>
> >>>
> >>>Given that we're dealing with iomem, I think it will work, but I could be
> >>>missing something obvious.
> >>
> >>Don't we have the race below where both threads can enter the critical
> >>section?
> >>
> >> // flag f initial zero (unlocked)
> >>
> >> // t1, flag 1 // t2, flag 2
> >> readl(f); // reads 0 l = readl(f); // reads 0
> >>
> >> <thinks lock is free> <thinks lock is free>
> >>
> >> writel(1, f);
> >> readl(f); // reads 1
> >> <thinks lock owned>
> >> writel(2, f);
> >> readl(f) // reads 2
> >> <thinks lock owned>
> >>
> >> <crticial section> <critical section>
> >
> >Urgh, yeah, of course and *that's* what the udelay is trying to avoid,
> >by "ensuring" that the <thinks lock is free> time and subsequent write
> >propagation is all over before we re-read the flag.
> >
> >John -- how much space do you have on this device? Do you have, e.g. a byte
> >for each CPU?
>
> Hi Will,
>
> To be clear, the agents in our case are the kernel and UEFI. Within the
> kernel, we use a kernel spinlock to lock the same djtag between threads, for
> these reasons:
> - kernel has a native spinlock
If we only have to effectively deal with two threads, then we might be able
to use something like Dekker's.
> - we are limited in locking values, as the lock flag is only a 8b field in
> v2 hw (called module select)
By 8b do you mean 8 bits or 8 bytes? If the latter, does it support sub-word
accesses?
Will
On 14/06/2017 12:40, Will Deacon wrote:
> On Wed, Jun 14, 2017 at 12:35:07PM +0100, John Garry wrote:
>> On 14/06/2017 12:01, Will Deacon wrote:
>>> On Wed, Jun 14, 2017 at 11:42:30AM +0100, Mark Rutland wrote:
>>>> On Wed, Jun 14, 2017 at 11:06:58AM +0100, Will Deacon wrote:
>>>>> Apologies, I misunderstood your algorithm (I thought step (a) was on one CPU
>>>>> and step (b) was on another). Still, I don't understand the need for the
>>>>> timeout. If you instead read back the flag immediately, wouldn't it still
>>>>> work? e.g.
>>>>>
>>>>>
>>>>> lock:
>>>>> Readl_relaxed flag
>>>>> if (locked)
>>>>> goto lock;
>>>>>
>>>>> Writel_relaxed unique ID to flag
>>>>> Readl flag
>>>>> if (locked by somebody else)
>>>>> goto lock;
>>>>>
>>>>> <critical section>
>>>>>
>>>>> unlock:
>>>>> Writel unlocked value to flag
>>>>>
>>>>>
>>>>> Given that we're dealing with iomem, I think it will work, but I could be
>>>>> missing something obvious.
>>>>
>>>> Don't we have the race below where both threads can enter the critical
>>>> section?
>>>>
>>>> // flag f initial zero (unlocked)
>>>>
>>>> // t1, flag 1 // t2, flag 2
>>>> readl(f); // reads 0 l = readl(f); // reads 0
>>>>
>>>> <thinks lock is free> <thinks lock is free>
>>>>
>>>> writel(1, f);
>>>> readl(f); // reads 1
>>>> <thinks lock owned>
>>>> writel(2, f);
>>>> readl(f) // reads 2
>>>> <thinks lock owned>
>>>>
>>>> <crticial section> <critical section>
>>>
>>> Urgh, yeah, of course and *that's* what the udelay is trying to avoid,
>>> by "ensuring" that the <thinks lock is free> time and subsequent write
>>> propagation is all over before we re-read the flag.
>>>
>>> John -- how much space do you have on this device? Do you have, e.g. a byte
>>> for each CPU?
>>
>> Hi Will,
>>
>> To be clear, the agents in our case are the kernel and UEFI. Within the
>> kernel, we use a kernel spinlock to lock the same djtag between threads, for
>> these reasons:
>> - kernel has a native spinlock
>
> If we only have to effectively deal with two threads, then we might be able
> to use something like Dekker's.
>
>> - we are limited in locking values, as the lock flag is only a 8b field in
>> v2 hw (called module select)
>
> By 8b do you mean 8 bits or 8 bytes? If the latter, does it support sub-word
> accesses?
8 bits
So the size depends: on v1 hw is a 6-bit field in a 32-bit register
(recent news to me), and on v2 hw it is a 8-bit field in a 32-bit register.
So for reading and writing the flag, we use readl/writel and also
necessary shifts+masks. Obviously this is not atomic, but the whole
process of write-and-check is not atomic - hence the delay.
I am not sure if sub-word access is required.
Thanks,
John
>
> Will
>
> .
>