2018-02-21 10:50:38

by Arend Van Spriel

[permalink] [raw]
Subject: [PATCH 0/3] drivers: support for sysfs initiated coredump

This series is intended for 4.17 adding support for sysfs initiated
coredump. This uses new functionality that was added in drivers base.
Device drivers can now implement a .coredump() callback upon which a
sysfs entry is created when the device is bound to the driver. From
user-space a device coredump can be initiated. The easiest way is by
going the drivers entry and enter into the bound device folder like
this:

# cd /sys/bus/pci/drivers/brcmfmac/0000:02:00.0
# echo 1 > coredump
# ls /sys/class/devcoredump/
devcd1 disabled
# ls -l /sys/class/devcoredump/devcd1/
total 0
-rw------- 1 root root 0 Feb 19 23:49 data
lrwxrwxrwx 1 root root 0 Feb 19 23:49 failing_device ->
../../../pci0000:00/0000:00:1c.0/0000:02:00.0
drwxr-xr-x 2 root root 0 Feb 19 23:49 power
lrwxrwxrwx 1 root root 0 Feb 19 23:49 subsystem ->
../../../../class/devcoredump
-rw-r--r-- 1 root root 4096 Feb 19 23:49 uevent

The device driver can implement the .coredump() callback as they like.
The use of the dev_coredump api is not enforced although the sysfs entry
is only created when CONFIG_DEV_COREDUMP is selected.

Apart from brcmfmac, the other drivers used dev_coredump api and had a
mechanism in place through debugfs, which is removed in these patches.
With these patches initiating the coredump can be done without selecting
CONFIG_DEBUGFS. No attempt was made to look for drivers providing some sore
of coredump functionality by other means than the dev_coredump api.

The first 2 patches apply to the master branch of the wireless-drivers-next
repository. The last patch applies to the master branch of the bluetooth-next.

Arend van Spriel (3):
brcmfmac: add support for sysfs initiated coredump
mwifiex: support sysfs initiated device coredump
btmrvl: support sysfs initiated firmware coredump

drivers/bluetooth/btmrvl_debugfs.c | 31 ----------------------
drivers/bluetooth/btmrvl_drv.h | 2 --
drivers/bluetooth/btmrvl_main.c | 6 -----
drivers/bluetooth/btmrvl_sdio.c | 18 ++++++++-----
.../wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c | 1 +
.../net/wireless/broadcom/brcm80211/brcmfmac/bus.h | 2 ++
.../wireless/broadcom/brcm80211/brcmfmac/core.c | 7 +++++
.../wireless/broadcom/brcm80211/brcmfmac/pcie.c | 1 +
drivers/net/wireless/marvell/mwifiex/debugfs.c | 31 +---------------------
drivers/net/wireless/marvell/mwifiex/pcie.c | 19 +++++++++++--
drivers/net/wireless/marvell/mwifiex/sdio.c | 13 +++++++++
drivers/net/wireless/marvell/mwifiex/usb.c | 14 ++++++++++
12 files changed, 68 insertions(+), 77 deletions(-)

--
1.9.1


2018-02-23 10:51:07

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 2/3] mwifiex: support sysfs initiated device coredump

Hey,

Hadn't really followed this discussion much due to other fires
elsewhere :-)

On Fri, 2018-02-23 at 11:39 +0100, Arend van Spriel wrote:

> > > Well, that depends on the eye of the beholder I guess. From user-space
> > > perspective it is asynchronous regardless. A write access to the coredump
> > > sysfs file eventually results in a uevent when the devcoredump entry is
> > > created, ie. after driver has made a dev_coredump API call. Whether the
> > > driver does that synchronously or asynchronously is irrelevant as far as
> > > user-space is concerned.
> >
> > Is it really? The driver infrastructure seems to guarantee that the
> > entirety of a driver's ->coredump() will complete before returning from
> > the write. So it might be reasonable for some user to assume (based on
> > implementation details, e.g., of brcmfmac) that the devcoredump will be
> > ready by the time the write() syscall returns, absent documentation that
> > says otherwise. But then, that's not how mwifiex works right now, so
> > they might be surprised if they switch drivers.

I can see how you might want to have that kind of behaviour, but you'd
have to jump through some hoops to see if the coredump you saw is
actually the right one - you probably want an asynchronous coredump
"collector" and then wait for it to show up (with some reasonable
timeout) on the actual filesystem, not on sysfs?

Otherwise you have to trawl sysfs for the right coredump I guess, which
too is possible.

> > > You are right. Clearly I did not reach the end my learning curve here. I
> > > assumed referring to the existing dev_coredump facility was sufficient, but
> > > maybe it is worth a patch to be more explicit and mention the uevent
> > > behavior. Also dev_coredump facility may be disabled upon which the trigger
> > > will have no effect in sysfs. In the kernel the data passed by the driver is
> > > simply freed by dev_coredump facility.
> >
> > Is there any other documentation for the coredump feature? I don't
> > really see much.
>
> Any other than the code itself you mean? I am not sure. Maybe Johannes
> knows.

There isn't really, it originally was really simple, but then somebody
(Kees perhaps?) requested a way to turn it off forever for security or
privacy concerns and it became more complicated.

> > static ssize_t coredump_store(struct device *dev, struct device_attribute *attr,
> > const char *buf, size_t count)
> > {
> > device_lock(dev);
> > if (dev->driver->coredump)
> > dev->driver->coredump(dev);
> > device_unlock(dev);
> >
> > return count;
> > }
> > static DEVICE_ATTR_WO(coredump);
> >
> > Is that a bug or a feature?
>
> Yeah. Let's call it a bug. Just not sure what to go for. Return the
> error or change coredump callback to void return type.

I'm not sure it matters all that much - the underlying devcoredump
calls all have no return value (void), and given the above complexities
with the ability to turn off devcoredumping entirely you cannot rely on
this return value to tell you if a dump was created or not, at least
not without much more infrastructure work.

johannes

2018-02-27 18:26:05

by Marcel Holtmann

[permalink] [raw]
Subject: Re: [PATCH 3/3] btmrvl: support sysfs initiated firmware coredump

Hi Arend,

> Since commit 3c47d19ff4dc ("drivers: base: add coredump driver ops")
> it is possible to initiate a device coredump from user-space. This
> patch adds support for it in btmrvl_sdio adding the .coredump()
> driver callback. This makes dump through debugfs obsolete so removing
> it.
>
> Signed-off-by: Arend van Spriel <[email protected]>
> ---
> drivers/bluetooth/btmrvl_debugfs.c | 31 -------------------------------
> drivers/bluetooth/btmrvl_drv.h | 2 --
> drivers/bluetooth/btmrvl_main.c | 6 ------
> drivers/bluetooth/btmrvl_sdio.c | 18 ++++++++++++------
> 4 files changed, 12 insertions(+), 45 deletions(-)

patch has been applied to bluetooth-next tree.

Regards

Marcel

2018-02-27 14:46:40

by Kalle Valo

[permalink] [raw]
Subject: Re: [3/3] btmrvl: support sysfs initiated firmware coredump

Arend Van Spriel <[email protected]> wrote:

> Since commit 3c47d19ff4dc ("drivers: base: add coredump driver ops")
> it is possible to initiate a device coredump from user-space. This
> patch adds support for it in btmrvl_sdio adding the .coredump()
> driver callback. This makes dump through debugfs obsolete so removing
> it.
>
> Signed-off-by: Arend van Spriel <[email protected]>

This is for bluetooth tree, dropping from my queue.

Patch set to Not Applicable.

--
https://patchwork.kernel.org/patch/10231923/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2018-02-21 10:50:39

by Arend Van Spriel

[permalink] [raw]
Subject: [PATCH 1/3] brcmfmac: add support for sysfs initiated coredump

The driver already supports device coredump initiated by firmware
event. Since commit 3c47d19ff4dc ("drivers: base: add coredump driver
ops") it is also possible to initiate it from user-space through
sysfs. This patch adds support for SDIO and PCIe devices.

Reviewed-by: Hante Meuleman <[email protected]>
Reviewed-by: Pieter-Paul Giesberts <[email protected]>
Reviewed-by: Franky Lin <[email protected]>
Signed-off-by: Arend van Spriel <[email protected]>
---
drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c | 1 +
drivers/net/wireless/broadcom/brcm80211/brcmfmac/bus.h | 2 ++
drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c | 7 +++++++
drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c | 1 +
4 files changed, 11 insertions(+)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
index 0b68240..3220b69 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
@@ -1164,6 +1164,7 @@ static int brcmf_ops_sdio_resume(struct device *dev)
#ifdef CONFIG_PM_SLEEP
.pm = &brcmf_sdio_pm_ops,
#endif /* CONFIG_PM_SLEEP */
+ .coredump = brcmf_dev_coredump,
},
};

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bus.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bus.h
index 0b76a61..77d1f34 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bus.h
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bus.h
@@ -249,6 +249,8 @@ int brcmf_bus_get_fwname(struct brcmf_bus *bus, uint chip, uint chiprev,
void brcmf_detach(struct device *dev);
/* Indication from bus module that dongle should be reset */
void brcmf_dev_reset(struct device *dev);
+/* Request from bus module to initiate a coredump */
+int brcmf_dev_coredump(struct device *dev);

/* Configure the "global" bus state used by upper layers */
void brcmf_bus_change_state(struct brcmf_bus *bus, enum brcmf_bus_state state);
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
index 930e423..d06ffe0 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
@@ -1110,6 +1110,13 @@ void brcmf_dev_reset(struct device *dev)
brcmf_fil_cmd_int_set(drvr->iflist[0], BRCMF_C_TERMINATED, 1);
}

+int brcmf_dev_coredump(struct device *dev)
+{
+ struct brcmf_bus *bus_if = dev_get_drvdata(dev);
+
+ return brcmf_debug_create_memdump(bus_if, NULL, 0);
+}
+
void brcmf_detach(struct device *dev)
{
s32 i;
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
index 8752707..502dd7b 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
@@ -2005,6 +2005,7 @@ static int brcmf_pcie_pm_leave_D3(struct device *dev)
#ifdef CONFIG_PM
.driver.pm = &brcmf_pciedrvr_pm,
#endif
+ .driver.coredump = brcmf_dev_coredump,
};


--
1.9.1

2018-02-21 10:50:39

by Arend Van Spriel

[permalink] [raw]
Subject: [PATCH 3/3] btmrvl: support sysfs initiated firmware coredump

Since commit 3c47d19ff4dc ("drivers: base: add coredump driver ops")
it is possible to initiate a device coredump from user-space. This
patch adds support for it in btmrvl_sdio adding the .coredump()
driver callback. This makes dump through debugfs obsolete so removing
it.

Signed-off-by: Arend van Spriel <[email protected]>
---
drivers/bluetooth/btmrvl_debugfs.c | 31 -------------------------------
drivers/bluetooth/btmrvl_drv.h | 2 --
drivers/bluetooth/btmrvl_main.c | 6 ------
drivers/bluetooth/btmrvl_sdio.c | 18 ++++++++++++------
4 files changed, 12 insertions(+), 45 deletions(-)

diff --git a/drivers/bluetooth/btmrvl_debugfs.c b/drivers/bluetooth/btmrvl_debugfs.c
index 1828ed8..023d35e 100644
--- a/drivers/bluetooth/btmrvl_debugfs.c
+++ b/drivers/bluetooth/btmrvl_debugfs.c
@@ -167,35 +167,6 @@ static ssize_t btmrvl_hscmd_read(struct file *file, char __user *userbuf,
.llseek = default_llseek,
};

-static ssize_t btmrvl_fwdump_write(struct file *file, const char __user *ubuf,
- size_t count, loff_t *ppos)
-{
- struct btmrvl_private *priv = file->private_data;
- char buf[16];
- bool result;
-
- memset(buf, 0, sizeof(buf));
-
- if (copy_from_user(&buf, ubuf, min_t(size_t, sizeof(buf) - 1, count)))
- return -EFAULT;
-
- if (strtobool(buf, &result))
- return -EINVAL;
-
- if (!result)
- return -EINVAL;
-
- btmrvl_firmware_dump(priv);
-
- return count;
-}
-
-static const struct file_operations btmrvl_fwdump_fops = {
- .write = btmrvl_fwdump_write,
- .open = simple_open,
- .llseek = default_llseek,
-};
-
void btmrvl_debugfs_init(struct hci_dev *hdev)
{
struct btmrvl_private *priv = hci_get_drvdata(hdev);
@@ -226,8 +197,6 @@ void btmrvl_debugfs_init(struct hci_dev *hdev)
priv, &btmrvl_hscmd_fops);
debugfs_create_file("hscfgcmd", 0644, dbg->config_dir,
priv, &btmrvl_hscfgcmd_fops);
- debugfs_create_file("fw_dump", 0200, dbg->config_dir,
- priv, &btmrvl_fwdump_fops);

dbg->status_dir = debugfs_create_dir("status", hdev->debugfs);
debugfs_create_u8("curpsmode", 0444, dbg->status_dir,
diff --git a/drivers/bluetooth/btmrvl_drv.h b/drivers/bluetooth/btmrvl_drv.h
index fc3caf4..f045454 100644
--- a/drivers/bluetooth/btmrvl_drv.h
+++ b/drivers/bluetooth/btmrvl_drv.h
@@ -110,7 +110,6 @@ struct btmrvl_private {
u8 *payload, u16 nb);
int (*hw_wakeup_firmware)(struct btmrvl_private *priv);
int (*hw_process_int_status)(struct btmrvl_private *priv);
- void (*firmware_dump)(struct btmrvl_private *priv);
spinlock_t driver_lock; /* spinlock used by driver */
#ifdef CONFIG_DEBUG_FS
void *debugfs_data;
@@ -183,7 +182,6 @@ struct btmrvl_event {
int btmrvl_enable_ps(struct btmrvl_private *priv);
int btmrvl_prepare_command(struct btmrvl_private *priv);
int btmrvl_enable_hs(struct btmrvl_private *priv);
-void btmrvl_firmware_dump(struct btmrvl_private *priv);

#ifdef CONFIG_DEBUG_FS
void btmrvl_debugfs_init(struct hci_dev *hdev);
diff --git a/drivers/bluetooth/btmrvl_main.c b/drivers/bluetooth/btmrvl_main.c
index b280d46..e5cf14d 100644
--- a/drivers/bluetooth/btmrvl_main.c
+++ b/drivers/bluetooth/btmrvl_main.c
@@ -358,12 +358,6 @@ int btmrvl_prepare_command(struct btmrvl_private *priv)
return ret;
}

-void btmrvl_firmware_dump(struct btmrvl_private *priv)
-{
- if (priv->firmware_dump)
- priv->firmware_dump(priv);
-}
-
static int btmrvl_tx_pkt(struct btmrvl_private *priv, struct sk_buff *skb)
{
int ret = 0;
diff --git a/drivers/bluetooth/btmrvl_sdio.c b/drivers/bluetooth/btmrvl_sdio.c
index 7dbb446..0f02025 100644
--- a/drivers/bluetooth/btmrvl_sdio.c
+++ b/drivers/bluetooth/btmrvl_sdio.c
@@ -1311,9 +1311,11 @@ rdwr_status btmrvl_sdio_rdwr_firmware(struct btmrvl_private *priv,
}

/* This function dump sdio register and memory data */
-static void btmrvl_sdio_dump_firmware(struct btmrvl_private *priv)
+static int btmrvl_sdio_coredump(struct device *dev)
{
- struct btmrvl_sdio_card *card = priv->btmrvl_dev.card;
+ struct sdio_func *func = dev_to_sdio_func(dev);
+ struct btmrvl_sdio_card *card;
+ struct btmrvl_private *priv;
int ret = 0;
unsigned int reg, reg_start, reg_end;
enum rdwr_status stat;
@@ -1321,12 +1323,15 @@ static void btmrvl_sdio_dump_firmware(struct btmrvl_private *priv)
u8 dump_num = 0, idx, i, read_reg, doneflag = 0;
u32 memory_size, fw_dump_len = 0;

+ card = sdio_get_drvdata(func);
+ priv = card->priv;
+
/* dump sdio register first */
btmrvl_sdio_dump_regs(priv);

if (!card->supports_fw_dump) {
BT_ERR("Firmware dump not supported for this card!");
- return;
+ return -ENODATA;
}

for (idx = 0; idx < ARRAY_SIZE(mem_type_mapping_tbl); idx++) {
@@ -1445,12 +1450,12 @@ static void btmrvl_sdio_dump_firmware(struct btmrvl_private *priv)
sdio_release_host(card->func);

if (fw_dump_len == 0)
- return;
+ return -ENODATA;

fw_dump_data = vzalloc(fw_dump_len+1);
if (!fw_dump_data) {
BT_ERR("Vzalloc fw_dump_data fail!");
- return;
+ return -ENOMEM;
}
fw_dump_ptr = fw_dump_data;

@@ -1487,6 +1492,7 @@ static void btmrvl_sdio_dump_firmware(struct btmrvl_private *priv)
*/
dev_coredumpv(&card->func->dev, fw_dump_data, fw_dump_len, GFP_KERNEL);
BT_INFO("== btmrvl firmware dump to /sys/class/devcoredump end");
+ return 0;
}

static int btmrvl_sdio_probe(struct sdio_func *func,
@@ -1547,7 +1553,6 @@ static int btmrvl_sdio_probe(struct sdio_func *func,
priv->hw_host_to_card = btmrvl_sdio_host_to_card;
priv->hw_wakeup_firmware = btmrvl_sdio_wakeup_fw;
priv->hw_process_int_status = btmrvl_sdio_process_int_status;
- priv->firmware_dump = btmrvl_sdio_dump_firmware;

if (btmrvl_register_hdev(priv)) {
BT_ERR("Register hdev failed!");
@@ -1717,6 +1722,7 @@ static int btmrvl_sdio_resume(struct device *dev)
.remove = btmrvl_sdio_remove,
.drv = {
.owner = THIS_MODULE,
+ .coredump = btmrvl_sdio_coredump,
.pm = &btmrvl_sdio_pm_ops,
}
};
--
1.9.1

2018-02-23 10:39:39

by Arend Van Spriel

[permalink] [raw]
Subject: Re: [PATCH 2/3] mwifiex: support sysfs initiated device coredump

+ Johannes (for devcoredump documentation question).

On 2/22/2018 8:35 PM, Brian Norris wrote:
> Hi Arend,
>
> On Thu, Feb 22, 2018 at 01:17:56PM +0100, Arend van Spriel wrote:
>> On 2/21/2018 11:59 PM, Brian Norris wrote:
>>> On Wed, Feb 21, 2018 at 11:50:19AM +0100, Arend van Spriel wrote:
>>>> Since commit 3c47d19ff4dc ("drivers: base: add coredump driver ops")
>>>> it is possible to initiate a device coredump from user-space. This
>>>> patch adds support for it adding the .coredump() driver callback.
>>>> As there is no longer a need to initiate it through debugfs remove
>>>> that code.
>>>>
>>>> Signed-off-by: Arend van Spriel <[email protected]>
>>>> ---
>>>> drivers/net/wireless/marvell/mwifiex/debugfs.c | 31 +-------------------------
>>>> drivers/net/wireless/marvell/mwifiex/pcie.c | 19 ++++++++++++++--
>>>> drivers/net/wireless/marvell/mwifiex/sdio.c | 13 +++++++++++
>>>> drivers/net/wireless/marvell/mwifiex/usb.c | 14 ++++++++++++
>>>> 4 files changed, 45 insertions(+), 32 deletions(-)
>>>
>>> The documentation doesn't really say [1], but is the coredump supposed
>>> to happen synchronously? Because the mwifiex implementation is
>>> asynchronous, whereas it looks like the brcmfmac one is synchronous.
>>
>> Well, that depends on the eye of the beholder I guess. From user-space
>> perspective it is asynchronous regardless. A write access to the coredump
>> sysfs file eventually results in a uevent when the devcoredump entry is
>> created, ie. after driver has made a dev_coredump API call. Whether the
>> driver does that synchronously or asynchronously is irrelevant as far as
>> user-space is concerned.
>
> Is it really? The driver infrastructure seems to guarantee that the
> entirety of a driver's ->coredump() will complete before returning from
> the write. So it might be reasonable for some user to assume (based on
> implementation details, e.g., of brcmfmac) that the devcoredump will be
> ready by the time the write() syscall returns, absent documentation that
> says otherwise. But then, that's not how mwifiex works right now, so
> they might be surprised if they switch drivers.

Ok. I already agreed that the uevent behavior should be documented. The
error handling you are bringing up below was something I realized as
well. I am not familiar with mwifiex to determine what it can say about
the coredump succeeding before scheduling the work.

> Anyway, *I'm* already personally used to these dumps being asynchronous,
> and writing tooling to listen for the uevent instead. But that doesn't
> mean everyone will be.
>
> Also, due to the differences in async/sync, mwifiex doesn't really
> provide you much chance for error handling, because most errors would be
> asynchronous. So brcmfmac's "coredump" has more chance for user programs
> to error-check than mwifiex's (due to the asynchronous nature) [1].
>
> BTW, I push on this mostly because this is migrating from a debugfs
> feature (that is easy to hand-wave off as not really providing a
> consistent/stable API, etc., etc.) to a documented sysfs feature. If it
> were left to rot in debugfs, I probably wouldn't be as bothered ;)

I appreciate it. The documentation is not in the stable ABI folder yet
so I welcome any and all improvements. Might learn a thing or two from it.

>>> Brian
>>>
>>> [1] In fact, the ABI documentation really just describes kernel
>>> internals, rather than documenting any user-facing details, from what I
>>> can tell.
>>
>> You are right. Clearly I did not reach the end my learning curve here. I
>> assumed referring to the existing dev_coredump facility was sufficient, but
>> maybe it is worth a patch to be more explicit and mention the uevent
>> behavior. Also dev_coredump facility may be disabled upon which the trigger
>> will have no effect in sysfs. In the kernel the data passed by the driver is
>> simply freed by dev_coredump facility.
>
> Is there any other documentation for the coredump feature? I don't
> really see much.

Any other than the code itself you mean? I am not sure. Maybe Johannes
knows.

> Brian
>
> [1] Oh wait, but I see that while ->coredump() has an integer return
> code...the caller ignores it:
>
> static ssize_t coredump_store(struct device *dev, struct device_attribute *attr,
> const char *buf, size_t count)
> {
> device_lock(dev);
> if (dev->driver->coredump)
> dev->driver->coredump(dev);
> device_unlock(dev);
>
> return count;
> }
> static DEVICE_ATTR_WO(coredump);
>
> Is that a bug or a feature?

Yeah. Let's call it a bug. Just not sure what to go for. Return the
error or change coredump callback to void return type.

Regards,
Arend

2018-02-22 19:35:52

by Brian Norris

[permalink] [raw]
Subject: Re: [PATCH 2/3] mwifiex: support sysfs initiated device coredump

Hi Arend,

On Thu, Feb 22, 2018 at 01:17:56PM +0100, Arend van Spriel wrote:
> On 2/21/2018 11:59 PM, Brian Norris wrote:
> > On Wed, Feb 21, 2018 at 11:50:19AM +0100, Arend van Spriel wrote:
> > > Since commit 3c47d19ff4dc ("drivers: base: add coredump driver ops")
> > > it is possible to initiate a device coredump from user-space. This
> > > patch adds support for it adding the .coredump() driver callback.
> > > As there is no longer a need to initiate it through debugfs remove
> > > that code.
> > >
> > > Signed-off-by: Arend van Spriel <[email protected]>
> > > ---
> > > drivers/net/wireless/marvell/mwifiex/debugfs.c | 31 +-------------------------
> > > drivers/net/wireless/marvell/mwifiex/pcie.c | 19 ++++++++++++++--
> > > drivers/net/wireless/marvell/mwifiex/sdio.c | 13 +++++++++++
> > > drivers/net/wireless/marvell/mwifiex/usb.c | 14 ++++++++++++
> > > 4 files changed, 45 insertions(+), 32 deletions(-)
> >
> > The documentation doesn't really say [1], but is the coredump supposed
> > to happen synchronously? Because the mwifiex implementation is
> > asynchronous, whereas it looks like the brcmfmac one is synchronous.
>
> Well, that depends on the eye of the beholder I guess. From user-space
> perspective it is asynchronous regardless. A write access to the coredump
> sysfs file eventually results in a uevent when the devcoredump entry is
> created, ie. after driver has made a dev_coredump API call. Whether the
> driver does that synchronously or asynchronously is irrelevant as far as
> user-space is concerned.

Is it really? The driver infrastructure seems to guarantee that the
entirety of a driver's ->coredump() will complete before returning from
the write. So it might be reasonable for some user to assume (based on
implementation details, e.g., of brcmfmac) that the devcoredump will be
ready by the time the write() syscall returns, absent documentation that
says otherwise. But then, that's not how mwifiex works right now, so
they might be surprised if they switch drivers.

Anyway, *I'm* already personally used to these dumps being asynchronous,
and writing tooling to listen for the uevent instead. But that doesn't
mean everyone will be.

Also, due to the differences in async/sync, mwifiex doesn't really
provide you much chance for error handling, because most errors would be
asynchronous. So brcmfmac's "coredump" has more chance for user programs
to error-check than mwifiex's (due to the asynchronous nature) [1].

BTW, I push on this mostly because this is migrating from a debugfs
feature (that is easy to hand-wave off as not really providing a
consistent/stable API, etc., etc.) to a documented sysfs feature. If it
were left to rot in debugfs, I probably wouldn't be as bothered ;)

> > Brian
> >
> > [1] In fact, the ABI documentation really just describes kernel
> > internals, rather than documenting any user-facing details, from what I
> > can tell.
>
> You are right. Clearly I did not reach the end my learning curve here. I
> assumed referring to the existing dev_coredump facility was sufficient, but
> maybe it is worth a patch to be more explicit and mention the uevent
> behavior. Also dev_coredump facility may be disabled upon which the trigger
> will have no effect in sysfs. In the kernel the data passed by the driver is
> simply freed by dev_coredump facility.

Is there any other documentation for the coredump feature? I don't
really see much.

Brian

[1] Oh wait, but I see that while ->coredump() has an integer return
code...the caller ignores it:

static ssize_t coredump_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
device_lock(dev);
if (dev->driver->coredump)
dev->driver->coredump(dev);
device_unlock(dev);

return count;
}
static DEVICE_ATTR_WO(coredump);

Is that a bug or a feature?

2018-02-26 22:06:15

by Brian Norris

[permalink] [raw]
Subject: Re: [PATCH 2/3] mwifiex: support sysfs initiated device coredump

Hi,

On Fri, Feb 23, 2018 at 2:51 AM, Johannes Berg
<[email protected]> wrote:
> On Fri, 2018-02-23 at 11:39 +0100, Arend van Spriel wrote:
>
> > > > Well, that depends on the eye of the beholder I guess. From user-space
> > > > perspective it is asynchronous regardless. A write access to the coredump
> > > > sysfs file eventually results in a uevent when the devcoredump entry is
> > > > created, ie. after driver has made a dev_coredump API call. Whether the
> > > > driver does that synchronously or asynchronously is irrelevant as far as
> > > > user-space is concerned.
> > >
> > > Is it really? The driver infrastructure seems to guarantee that the
> > > entirety of a driver's ->coredump() will complete before returning from
> > > the write. So it might be reasonable for some user to assume (based on
> > > implementation details, e.g., of brcmfmac) that the devcoredump will be
> > > ready by the time the write() syscall returns, absent documentation that
> > > says otherwise. But then, that's not how mwifiex works right now, so
> > > they might be surprised if they switch drivers.
>
> I can see how you might want to have that kind of behaviour, but you'd
> have to jump through some hoops to see if the coredump you saw is
> actually the right one - you probably want an asynchronous coredump
> "collector" and then wait for it to show up (with some reasonable
> timeout) on the actual filesystem, not on sysfs?
>
> Otherwise you have to trawl sysfs for the right coredump I guess, which
> too is possible.

It's not that I want that interface. It's that I want the *lack* of
such an interface to be guaranteed in the documentation. When the
questions like "where? when?" are not answered in the doc, users are
totally allowed to speculate ;) Perhaps the "where" can be deferred to
other documentation (which should probably exist someday), but the
"when" should be listed as "eventually; or not at all; listen for a
uevent."

> > > > You are right. Clearly I did not reach the end my learning curve here. I
> > > > assumed referring to the existing dev_coredump facility was sufficient, but
> > > > maybe it is worth a patch to be more explicit and mention the uevent
> > > > behavior. Also dev_coredump facility may be disabled upon which the trigger
> > > > will have no effect in sysfs. In the kernel the data passed by the driver is
> > > > simply freed by dev_coredump facility.
> > >
> > > Is there any other documentation for the coredump feature? I don't
> > > really see much.
> >
> > Any other than the code itself you mean? I am not sure. Maybe Johannes
> > knows.
>
> There isn't really, it originally was really simple, but then somebody
> (Kees perhaps?) requested a way to turn it off forever for security or
> privacy concerns and it became more complicated.

Then I don't think when adding a new sysfs ABI, we should be deferring
to "existing dev_coredump facility [documentation]" (which doesn't
exist). And just a few words about the user-facing interface would be
nice for the documentation. There previously wasn't any official way
to trigger a dump from userspace -- only from random debugfs files, I
think, or from unspecified device failures.

> > > static ssize_t coredump_store(struct device *dev, struct device_attribute *attr,
> > > const char *buf, size_t count)
> > > {
> > > device_lock(dev);
> > > if (dev->driver->coredump)
> > > dev->driver->coredump(dev);
> > > device_unlock(dev);
> > >
> > > return count;
> > > }
> > > static DEVICE_ATTR_WO(coredump);
> > >
> > > Is that a bug or a feature?
> >
> > Yeah. Let's call it a bug. Just not sure what to go for. Return the
> > error or change coredump callback to void return type.
>
> I'm not sure it matters all that much - the underlying devcoredump
> calls all have no return value (void), and given the above complexities
> with the ability to turn off devcoredumping entirely you cannot rely on
> this return value to tell you if a dump was created or not, at least
> not without much more infrastructure work.

Then perhaps it makes sense to remove the return code before you
create users of it.

Brian

2018-02-21 10:50:38

by Arend Van Spriel

[permalink] [raw]
Subject: [PATCH 2/3] mwifiex: support sysfs initiated device coredump

Since commit 3c47d19ff4dc ("drivers: base: add coredump driver ops")
it is possible to initiate a device coredump from user-space. This
patch adds support for it adding the .coredump() driver callback.
As there is no longer a need to initiate it through debugfs remove
that code.

Signed-off-by: Arend van Spriel <[email protected]>
---
drivers/net/wireless/marvell/mwifiex/debugfs.c | 31 +-------------------------
drivers/net/wireless/marvell/mwifiex/pcie.c | 19 ++++++++++++++--
drivers/net/wireless/marvell/mwifiex/sdio.c | 13 +++++++++++
drivers/net/wireless/marvell/mwifiex/usb.c | 14 ++++++++++++
4 files changed, 45 insertions(+), 32 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/debugfs.c b/drivers/net/wireless/marvell/mwifiex/debugfs.c
index db2872d..0745393 100644
--- a/drivers/net/wireless/marvell/mwifiex/debugfs.c
+++ b/drivers/net/wireless/marvell/mwifiex/debugfs.c
@@ -154,34 +154,6 @@
}

/*
- * Proc device dump read handler.
- *
- * This function is called when the 'device_dump' file is opened for
- * reading.
- * This function dumps driver information and firmware memory segments
- * (ex. DTCM, ITCM, SQRAM etc.) for
- * debugging.
- */
-static ssize_t
-mwifiex_device_dump_read(struct file *file, char __user *ubuf,
- size_t count, loff_t *ppos)
-{
- struct mwifiex_private *priv = file->private_data;
-
- /* For command timeouts, USB firmware will automatically emit
- * firmware dump events, so we don't implement device_dump().
- * For user-initiated dumps, we trigger it ourselves.
- */
- if (priv->adapter->iface_type == MWIFIEX_USB)
- mwifiex_send_cmd(priv, HostCmd_CMD_FW_DUMP_EVENT,
- HostCmd_ACT_GEN_SET, 0, NULL, true);
- else
- priv->adapter->if_ops.device_dump(priv->adapter);
-
- return 0;
-}
-
-/*
* Proc getlog file read handler.
*
* This function is called when the 'getlog' file is opened for reading
@@ -980,7 +952,6 @@
MWIFIEX_DFS_FILE_READ_OPS(info);
MWIFIEX_DFS_FILE_READ_OPS(debug);
MWIFIEX_DFS_FILE_READ_OPS(getlog);
-MWIFIEX_DFS_FILE_READ_OPS(device_dump);
MWIFIEX_DFS_FILE_OPS(regrdwr);
MWIFIEX_DFS_FILE_OPS(rdeeprom);
MWIFIEX_DFS_FILE_OPS(memrw);
@@ -1011,7 +982,7 @@
MWIFIEX_DFS_ADD_FILE(getlog);
MWIFIEX_DFS_ADD_FILE(regrdwr);
MWIFIEX_DFS_ADD_FILE(rdeeprom);
- MWIFIEX_DFS_ADD_FILE(device_dump);
+
MWIFIEX_DFS_ADD_FILE(memrw);
MWIFIEX_DFS_ADD_FILE(hscfg);
MWIFIEX_DFS_ADD_FILE(histogram);
diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 97a6199..0959526 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -320,6 +320,20 @@ static void mwifiex_pcie_shutdown(struct pci_dev *pdev)
return;
}

+static int mwifiex_pcie_coredump(struct device *dev)
+{
+ struct pci_dev *pdev;
+ struct pcie_service_card *card;
+
+ pdev = container_of(dev, struct pci_dev, dev);
+ card = pci_get_drvdata(pdev);
+
+ if (!test_and_set_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP,
+ &card->work_flags))
+ schedule_work(&card->work);
+ return 0;
+}
+
static const struct pci_device_id mwifiex_ids[] = {
{
PCIE_VENDOR_ID_MARVELL, PCIE_DEVICE_ID_MARVELL_88W8766P,
@@ -415,11 +429,12 @@ static SIMPLE_DEV_PM_OPS(mwifiex_pcie_pm_ops, mwifiex_pcie_suspend,
.id_table = mwifiex_ids,
.probe = mwifiex_pcie_probe,
.remove = mwifiex_pcie_remove,
-#ifdef CONFIG_PM_SLEEP
.driver = {
+ .coredump = mwifiex_pcie_coredump,
+#ifdef CONFIG_PM_SLEEP
.pm = &mwifiex_pcie_pm_ops,
- },
#endif
+ },
.shutdown = mwifiex_pcie_shutdown,
.err_handler = &mwifiex_pcie_err_handler,
};
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index a828801..24b9ff3 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -466,6 +466,18 @@ static int mwifiex_sdio_suspend(struct device *dev)
return ret;
}

+static int mwifiex_sdio_coredump(struct device *dev)
+{
+ struct sdio_func *func = dev_to_sdio_func(dev);
+ struct sdio_mmc_card *card;
+
+ card = sdio_get_drvdata(func);
+ if (!test_and_set_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP,
+ &card->work_flags))
+ schedule_work(&card->work);
+ return 0;
+}
+
/* Device ID for SD8786 */
#define SDIO_DEVICE_ID_MARVELL_8786 (0x9116)
/* Device ID for SD8787 */
@@ -515,6 +527,7 @@ static int mwifiex_sdio_suspend(struct device *dev)
.remove = mwifiex_sdio_remove,
.drv = {
.owner = THIS_MODULE,
+ .coredump = mwifiex_sdio_coredump,
.pm = &mwifiex_sdio_pm_ops,
}
};
diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c b/drivers/net/wireless/marvell/mwifiex/usb.c
index 4bc2448..83815f6 100644
--- a/drivers/net/wireless/marvell/mwifiex/usb.c
+++ b/drivers/net/wireless/marvell/mwifiex/usb.c
@@ -653,6 +653,17 @@ static void mwifiex_usb_disconnect(struct usb_interface *intf)
usb_put_dev(interface_to_usbdev(intf));
}

+static int mwifiex_usb_coredump(struct device *dev)
+{
+ struct usb_interface *intf = to_usb_interface(dev);
+ struct usb_card_rec *card = usb_get_intfdata(intf);
+
+ mwifiex_send_cmd(mwifiex_get_priv(card->adapter, MWIFIEX_BSS_ROLE_ANY),
+ HostCmd_CMD_FW_DUMP_EVENT, HostCmd_ACT_GEN_SET, 0,
+ NULL, true);
+ return 0;
+}
+
static struct usb_driver mwifiex_usb_driver = {
.name = "mwifiex_usb",
.probe = mwifiex_usb_probe,
@@ -661,6 +672,9 @@ static void mwifiex_usb_disconnect(struct usb_interface *intf)
.suspend = mwifiex_usb_suspend,
.resume = mwifiex_usb_resume,
.soft_unbind = 1,
+ .drvwrap.driver = {
+ .coredump = mwifiex_usb_coredump,
+ },
};

static int mwifiex_write_data_sync(struct mwifiex_adapter *adapter, u8 *pbuf,
--
1.9.1

2018-02-26 22:25:07

by Arend Van Spriel

[permalink] [raw]
Subject: Re: [PATCH 2/3] mwifiex: support sysfs initiated device coredump

On 2/26/2018 11:06 PM, Brian Norris wrote:
> Hi,
>
> On Fri, Feb 23, 2018 at 2:51 AM, Johannes Berg
> <[email protected]> wrote:
>> On Fri, 2018-02-23 at 11:39 +0100, Arend van Spriel wrote:
>>
>>>>> Well, that depends on the eye of the beholder I guess. From user-space
>>>>> perspective it is asynchronous regardless. A write access to the coredump
>>>>> sysfs file eventually results in a uevent when the devcoredump entry is
>>>>> created, ie. after driver has made a dev_coredump API call. Whether the
>>>>> driver does that synchronously or asynchronously is irrelevant as far as
>>>>> user-space is concerned.
>>>>
>>>> Is it really? The driver infrastructure seems to guarantee that the
>>>> entirety of a driver's ->coredump() will complete before returning from
>>>> the write. So it might be reasonable for some user to assume (based on
>>>> implementation details, e.g., of brcmfmac) that the devcoredump will be
>>>> ready by the time the write() syscall returns, absent documentation that
>>>> says otherwise. But then, that's not how mwifiex works right now, so
>>>> they might be surprised if they switch drivers.
>>
>> I can see how you might want to have that kind of behaviour, but you'd
>> have to jump through some hoops to see if the coredump you saw is
>> actually the right one - you probably want an asynchronous coredump
>> "collector" and then wait for it to show up (with some reasonable
>> timeout) on the actual filesystem, not on sysfs?
>>
>> Otherwise you have to trawl sysfs for the right coredump I guess, which
>> too is possible.
>
> It's not that I want that interface. It's that I want the *lack* of
> such an interface to be guaranteed in the documentation. When the
> questions like "where? when?" are not answered in the doc, users are
> totally allowed to speculate ;) Perhaps the "where" can be deferred to
> other documentation (which should probably exist someday), but the
> "when" should be listed as "eventually; or not at all; listen for a
> uevent."

Agree. Will extend/improve the ABI documentation.

>>>>> You are right. Clearly I did not reach the end my learning curve here. I
>>>>> assumed referring to the existing dev_coredump facility was sufficient, but
>>>>> maybe it is worth a patch to be more explicit and mention the uevent
>>>>> behavior. Also dev_coredump facility may be disabled upon which the trigger
>>>>> will have no effect in sysfs. In the kernel the data passed by the driver is
>>>>> simply freed by dev_coredump facility.
>>>>
>>>> Is there any other documentation for the coredump feature? I don't
>>>> really see much.
>>>
>>> Any other than the code itself you mean? I am not sure. Maybe Johannes
>>> knows.
>>
>> There isn't really, it originally was really simple, but then somebody
>> (Kees perhaps?) requested a way to turn it off forever for security or
>> privacy concerns and it became more complicated.
>
> Then I don't think when adding a new sysfs ABI, we should be deferring
> to "existing dev_coredump facility [documentation]" (which doesn't
> exist). And just a few words about the user-facing interface would be
> nice for the documentation. There previously wasn't any official way
> to trigger a dump from userspace -- only from random debugfs files, I
> think, or from unspecified device failures.

That was my main motivation to have this. The debugfs method did not
feel quite right as there is no kconfig dependency between dev_coredump
and debugfs. Now I discussed with Johannes about adding code into the
dev_coredump facility, but that seemed to add a lot of complexity. So I
looking into the device driver core and found it to be the simpler solution.

>>>> static ssize_t coredump_store(struct device *dev, struct device_attribute *attr,
>>>> const char *buf, size_t count)
>>>> {
>>>> device_lock(dev);
>>>> if (dev->driver->coredump)
>>>> dev->driver->coredump(dev);
>>>> device_unlock(dev);
>>>>
>>>> return count;
>>>> }
>>>> static DEVICE_ATTR_WO(coredump);
>>>>
>>>> Is that a bug or a feature?
>>>
>>> Yeah. Let's call it a bug. Just not sure what to go for. Return the
>>> error or change coredump callback to void return type.
>>
>> I'm not sure it matters all that much - the underlying devcoredump
>> calls all have no return value (void), and given the above complexities
>> with the ability to turn off devcoredumping entirely you cannot rely on
>> this return value to tell you if a dump was created or not, at least
>> not without much more infrastructure work.
>
> Then perhaps it makes sense to remove the return code before you
> create users of it.

Yup. Will sent out a patch for that as well.

Thanks,
Arend

2018-02-22 12:17:59

by Arend Van Spriel

[permalink] [raw]
Subject: Re: [PATCH 2/3] mwifiex: support sysfs initiated device coredump

On 2/21/2018 11:59 PM, Brian Norris wrote:
> On Wed, Feb 21, 2018 at 11:50:19AM +0100, Arend van Spriel wrote:
>> Since commit 3c47d19ff4dc ("drivers: base: add coredump driver ops")
>> it is possible to initiate a device coredump from user-space. This
>> patch adds support for it adding the .coredump() driver callback.
>> As there is no longer a need to initiate it through debugfs remove
>> that code.
>>
>> Signed-off-by: Arend van Spriel <[email protected]>
>> ---
>> drivers/net/wireless/marvell/mwifiex/debugfs.c | 31 +-------------------------
>> drivers/net/wireless/marvell/mwifiex/pcie.c | 19 ++++++++++++++--
>> drivers/net/wireless/marvell/mwifiex/sdio.c | 13 +++++++++++
>> drivers/net/wireless/marvell/mwifiex/usb.c | 14 ++++++++++++
>> 4 files changed, 45 insertions(+), 32 deletions(-)
>
> The documentation doesn't really say [1], but is the coredump supposed
> to happen synchronously? Because the mwifiex implementation is
> asynchronous, whereas it looks like the brcmfmac one is synchronous.

Well, that depends on the eye of the beholder I guess. From user-space
perspective it is asynchronous regardless. A write access to the
coredump sysfs file eventually results in a uevent when the devcoredump
entry is created, ie. after driver has made a dev_coredump API call.
Whether the driver does that synchronously or asynchronously is
irrelevant as far as user-space is concerned.

> Brian
>
> [1] In fact, the ABI documentation really just describes kernel
> internals, rather than documenting any user-facing details, from what I
> can tell.

You are right. Clearly I did not reach the end my learning curve here. I
assumed referring to the existing dev_coredump facility was sufficient,
but maybe it is worth a patch to be more explicit and mention the uevent
behavior. Also dev_coredump facility may be disabled upon which the
trigger will have no effect in sysfs. In the kernel the data passed by
the driver is simply freed by dev_coredump facility.

Regards,
Arend

2018-02-21 22:59:08

by Brian Norris

[permalink] [raw]
Subject: Re: [PATCH 2/3] mwifiex: support sysfs initiated device coredump

On Wed, Feb 21, 2018 at 11:50:19AM +0100, Arend van Spriel wrote:
> Since commit 3c47d19ff4dc ("drivers: base: add coredump driver ops")
> it is possible to initiate a device coredump from user-space. This
> patch adds support for it adding the .coredump() driver callback.
> As there is no longer a need to initiate it through debugfs remove
> that code.
>
> Signed-off-by: Arend van Spriel <[email protected]>
> ---
> drivers/net/wireless/marvell/mwifiex/debugfs.c | 31 +-------------------------
> drivers/net/wireless/marvell/mwifiex/pcie.c | 19 ++++++++++++++--
> drivers/net/wireless/marvell/mwifiex/sdio.c | 13 +++++++++++
> drivers/net/wireless/marvell/mwifiex/usb.c | 14 ++++++++++++
> 4 files changed, 45 insertions(+), 32 deletions(-)

The documentation doesn't really say [1], but is the coredump supposed
to happen synchronously? Because the mwifiex implementation is
asynchronous, whereas it looks like the brcmfmac one is synchronous.

Brian

[1] In fact, the ABI documentation really just describes kernel
internals, rather than documenting any user-facing details, from what I
can tell.

2018-03-12 09:41:15

by Kalle Valo

[permalink] [raw]
Subject: Re: [2/3] mwifiex: support sysfs initiated device coredump

Arend Van Spriel <[email protected]> wrote:

> Since commit 3c47d19ff4dc ("drivers: base: add coredump driver ops")
> it is possible to initiate a device coredump from user-space. This
> patch adds support for it adding the .coredump() driver callback.
> As there is no longer a need to initiate it through debugfs remove
> that code.
>
> Signed-off-by: Arend van Spriel <[email protected]>

Based on the discussion I assume this is ok to take to w-d-next. If that's not
the case, please let me know ASAP.

--
https://patchwork.kernel.org/patch/10231933/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches