LinuxLists.cc - [RFC PATCH net-next 0/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

2021-07-19 15:03:35

Subject: [RFC PATCH net-next 0/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

The BAM Data Multiplexer provides access to the network data channels
of modems integrated into many older Qualcomm SoCs, e.g. Qualcomm MSM8916
or MSM8974. This series adds a driver that allows using it.

For more information about BAM-DMUX, see PATCH 4/4.

In short, BAM-DMUX is built using a simple protocol layer on top of
a DMA engine (Qualcomm BAM DMA). For BAM-DMUX, the BAM DMA engine runs in
a quite strange mode that I call "remote power collapse", where the
modem/remote side is responsible for powering on the BAM when needed but we
are responsible for initializing it. The BAM is power-collapsed when unneeded
by coordinating power control via bidirectional interrupts from the
BAM-DMUX driver.
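
To make the "remote power collapse" handshake described above more concrete, here is a minimal userspace model. It is only a sketch under assumptions: the struct and function names (pc_model, pc_model_vote, ...) are hypothetical and do not exist in the kernel, and the real driver uses smem-state bits and interrupts rather than plain booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the handshake; not kernel code. */
struct pc_model {
	bool local_vote;   /* our vote that the BAM should be powered */
	bool remote_state; /* power state last signalled by the modem */
	bool ack_line;     /* level of our ack line; every ack toggles it */
	bool bam_ready;    /* true while the BAM is initialized on our side */
};

/* The local side votes for (or against) keeping the BAM powered. */
static void pc_model_vote(struct pc_model *m, bool enable)
{
	m->local_vote = enable;
}

/* The modem signals a power state change; we react and acknowledge it. */
static void pc_model_remote_toggle(struct pc_model *m)
{
	bool new_state = !m->remote_state;

	/* Power on: we initialize the BAM. Power off: we tear it down. */
	m->bam_ready = new_state;
	/* The ack is signalled as an edge, so the line is toggled. */
	m->ack_line = !m->ack_line;
	m->remote_state = new_state;
}

/* Sending is only safe if we voted and the BAM has been initialized. */
static bool pc_model_can_tx(const struct pc_model *m)
{
	return m->local_vote && m->bam_ready;
}
```

In this model, voting alone is not enough to transmit: the modem has to power up the BAM and the local side has to initialize it first.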

The series first adds one possible solution for handling this "remote power
collapse" mode in the bam_dma driver, then it adds the BAM-DMUX driver to
the WWAN subsystem. Note that the BAM-DMUX driver does not actually make
use of the WWAN subsystem yet, since I'm not sure how to fit it in there
yet (see PATCH 4/4).

Please note that all of the changes in this patch series are based on
a fairly complicated driver from Qualcomm [1].
I do not have access to any documentation about "BAM-DMUX". :(

The driver has been used in postmarketOS [2] on various smartphones/tablets
based on Qualcomm MSM8916 and MSM8974 for a year now with no reported
problems.

At runtime (but not compile-time), the following two patches are needed
additionally for full functionality:
- https://lore.kernel.org/linux-arm-msm/[email protected]/
- https://lore.kernel.org/linux-arm-msm/[email protected]/

[1]: https://source.codeaurora.org/quic/la/kernel/msm-3.10/tree/drivers/soc/qcom/bam_dmux.c?h=LA.BR.1.2.9.1-02310-8x16.0
[2]: https://postmarketos.org/

Stephan Gerhold (4):
dt-bindings: dmaengine: bam_dma: Add remote power collapse mode
dmaengine: qcom: bam_dma: Add remote power collapse mode
dt-bindings: net: Add schema for Qualcomm BAM-DMUX
net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

.../devicetree/bindings/dma/qcom_bam_dma.txt | 2 +
.../bindings/net/qcom,bam-dmux.yaml | 87 ++
MAINTAINERS | 8 +
drivers/dma/qcom/bam_dma.c | 88 +-
drivers/net/wwan/Kconfig | 13 +
drivers/net/wwan/Makefile | 1 +
drivers/net/wwan/qcom_bam_dmux.c | 907 ++++++++++++++++++
7 files changed, 1074 insertions(+), 32 deletions(-)
create mode 100644 Documentation/devicetree/bindings/net/qcom,bam-dmux.yaml
create mode 100644 drivers/net/wwan/qcom_bam_dmux.c

--
2.32.0

2021-07-19 15:04:21

by Stephan Gerhold

Subject: [RFC PATCH net-next 4/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

The BAM Data Multiplexer provides access to the network data channels of
modems integrated into many older Qualcomm SoCs, e.g. Qualcomm MSM8916 or
MSM8974. It is built using a simple protocol layer on top of a DMA engine
(Qualcomm BAM) and bidirectional interrupts to coordinate power control.
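
As an illustration of the "simple protocol layer", the following userspace sketch frames one data packet. The header layout and magic value are taken from struct bam_dmux_hdr in the diff below; the names dmux_hdr and dmux_frame_data are hypothetical, and the fields are stored in host byte order just as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DMUX_HDR_MAGIC 0x33fc /* BAM_DMUX_HDR_MAGIC in the patch */

enum { DMUX_CMD_DATA, DMUX_CMD_OPEN, DMUX_CMD_CLOSE };

/* Same field order as struct bam_dmux_hdr in the patch (8 bytes). */
struct dmux_hdr {
	uint16_t magic;
	uint8_t signal;
	uint8_t cmd;
	uint8_t pad;
	uint8_t ch;
	uint16_t len;
};

/*
 * Write header + payload + zero padding into buf.
 * Returns the frame size, or 0 if buf is too small.
 */
static size_t dmux_frame_data(uint8_t *buf, size_t cap, uint8_t ch,
			      const void *payload, uint16_t len)
{
	/* Mirrors the patch: an already aligned payload still gets 4 pad bytes. */
	uint8_t pad = (uint8_t)(sizeof(uint32_t) - len % sizeof(uint32_t));
	struct dmux_hdr hdr = {
		.magic = DMUX_HDR_MAGIC,
		.cmd = DMUX_CMD_DATA,
		.pad = pad,
		.ch = ch,
		.len = len, /* payload length, padding not included */
	};
	size_t total = sizeof(hdr) + len + pad;

	if (total > cap)
		return 0;

	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + sizeof(hdr), payload, len);
	memset(buf + sizeof(hdr) + len, 0, pad);
	return total;
}
```

For a 5-byte payload this yields a 16-byte frame: 8 bytes of header, 5 bytes of data and 3 bytes of padding.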

The modem announces a fixed set of channels by sending an OPEN command.
The driver exports each channel as a separate network interface so that
a connection can be established via QMI from userspace. The network
interface can work either in Ethernet or Raw-IP mode (configurable via
QMI). However, Ethernet mode seems to be broken with most firmwares
(network packets are actually received as Raw-IP), therefore the driver
only supports Raw-IP mode.
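
Since Raw-IP packets carry no link-layer header, the protocol has to be derived from the packet itself. The sketch below shows the idea in userspace C; it follows the version-nibble check in bam_dmux_cmd_data() from the diff, but the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rawip_proto { RAWIP_IPV4, RAWIP_IPV6, RAWIP_OTHER, RAWIP_EMPTY };

/* Classify a Raw-IP packet by the IP version nibble of its first byte. */
static enum rawip_proto rawip_classify(const uint8_t *data, size_t len)
{
	if (len == 0)
		return RAWIP_EMPTY;

	switch (data[0] & 0xf0) {
	case 0x40:
		return RAWIP_IPV4;
	case 0x60:
		return RAWIP_IPV6;
	default:
		/* The patch treats everything else as QMAP (ETH_P_MAP). */
		return RAWIP_OTHER;
	}
}
```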

The driver uses runtime PM to coordinate power control with the modem.
TX/RX buffers are put in a kind of "ring queue" and submitted via
the bam_dma driver of the DMAEngine subsystem.
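
The "ring queue" can be modelled in a few lines of userspace C. This is a simplified sketch of the slot handling in bam_dmux_tx_queue() and bam_dmux_tx_done() from the diff, without locking, DMA or runtime PM; the names tx_ring, tx_ring_queue and tx_ring_done are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 32 /* BAM_DMUX_NUM_SKB in the patch */

struct tx_ring {
	const void *slot[RING_SLOTS]; /* NULL means the slot is free */
	unsigned int next;            /* running counter, used modulo RING_SLOTS */
	bool stopped;                 /* stands in for netif_stop_queue() */
};

/* Claim the next slot; returns its index, or -1 if the ring is full. */
static int tx_ring_queue(struct tx_ring *r, const void *buf)
{
	unsigned int idx = r->next % RING_SLOTS;

	if (r->slot[idx]) {
		r->stopped = true;
		return -1;
	}
	r->slot[idx] = buf;

	/* Stop early if the slot after this one is still in flight. */
	r->next++;
	if (r->slot[r->next % RING_SLOTS])
		r->stopped = true;

	return (int)idx;
}

/* Release a slot on completion; wake up if the producer waits for it. */
static void tx_ring_done(struct tx_ring *r, unsigned int idx)
{
	r->slot[idx] = NULL;
	if (idx == r->next % RING_SLOTS)
		r->stopped = false;
}
```

Slots are always claimed in order, so only the completion of the slot the producer will use next can restart transmission.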

The basic architecture looks roughly like this:

           +------------+                +-------+
[IPv4/6]   |  BAM-DMUX  |                |       |
[Data...]  |            |                |       |
---------->|rmnet0      | [DMUX chan: x] |       |
[IPv4/6]   | (chan: 0)  |    [IPv4/6]    |       |
[Data...]  |            |   [Data...]    |       |
---------->|rmnet1      |--------------->| Modem |
           | (chan: 1)  |      BAM       |       |
[IPv4/6]   | ...        |  (DMA Engine)  |       |
[Data...]  |            |                |       |
---------->|rmnet7      |                |       |
           | (chan: 7)  |                |       |
           +------------+                +-------+

However, on newer SoCs/firmware versions Qualcomm began gradually moving
to QMAP (rmnet driver) as a backend-independent protocol for multiplexing
and data aggregation. Some firmware versions allow using QMAP on top of
BAM-DMUX (effectively resulting in a second multiplexing layer plus data
aggregation). The architecture with QMAP would look roughly like this:

           +-------------+           +------------+                  +-------+
[IPv4/6]   |    RMNET    |           |  BAM-DMUX  |                  |       |
[Data...]  |             |           |            | [DMUX chan: 0]   |       |
---------->|rmnet_data1  |     ----->|rmnet0      | [QMAP mux-id: x] |       |
           | (mux-id: 1) |     |     | (chan: 0)  |     [IPv4/6]     |       |
           |             |     |     |            |    [Data...]     |       |
[IPv4/6]   |     ...     |------     |            |----------------->| Modem |
[Data...]  |             |           |            |       BAM        |       |
---------->|rmnet_data42 | [QMAP: x] |[rmnet1]    |   (DMA Engine)   |       |
           | (mux-id: 42)| [IPv4/6]  |... unused! |                  |       |
           |             | [Data...] |[rmnet7]    |                  |       |
           |             |           |            |                  |       |
           +-------------+           +------------+                  +-------+

In this case, rmnet1-7 would remain unused. The firmware used on the most
recent SoCs with BAM-DMUX even seems to announce only a single BAM-DMUX
channel (rmnet0), which makes QMAP the only option for multiplexing there.

So far the driver is mainly tested on various smartphones/tablets based on
Qualcomm MSM8916/MSM8974 without QMAP. It looks like QMAP depends on an MTU
negotiation feature in BAM-DMUX which is not yet supported by the driver.

Signed-off-by: Stephan Gerhold <[email protected]>
---
Note that this is my first network driver, so I apologize in advance
if I made some obvious mistakes. :)

I'm not sure how to integrate the driver with the WWAN subsystem yet.
At the moment the driver creates network interfaces for all channels
announced by the modem; it does not make use of the WWAN link management
yet. Unfortunately, this is a bit complicated:

Both QMAP and the built-in multiplexing layer might be needed at some point.
There are firmware versions that do not support QMAP and the other way around
(the built-in multiplexing was disabled on very recent firmware versions).
Only userspace can check if QMAP is supported in the firmware (via QMI).

I could ignore QMAP completely for now but I think someone will show up
who will need this eventually. And if there is going to be common code for
QMAP/rmnet link management it would be nice if BAM-DMUX could also make
use of it.

But the question is, what could this look like? How do we know if we should
create a link for QMAP or a BAM-DMUX channel? Does it even make sense
to manage the 1-8 channels via the WWAN link management?

Another problem is that the WWAN subsystem currently creates all network
interfaces below the common WWAN device. This means that userspace like
ModemManager has no way to check which driver provides them. This is
necessary though to decide how to set it up via QMI (ModemManager uses it).

For reference, example of the channels announced by firmwares on various SoCs:
- Qualcomm MSM8974: channel 0-7, QMAP not supported
- Qualcomm MSM8916: channel 0-7, QMAP usually supported, but not always
(depends on firmware version)
- Qualcomm MSM8937: channel 0 only, QMAP required for multiplexing(?)
(Note: This one is theoretical, based on logs; it has not been tested so far...)
---
MAINTAINERS | 8 +
drivers/net/wwan/Kconfig | 13 +
drivers/net/wwan/Makefile | 1 +
drivers/net/wwan/qcom_bam_dmux.c | 907 +++++++++++++++++++++++++++++++
4 files changed, 929 insertions(+)
create mode 100644 drivers/net/wwan/qcom_bam_dmux.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e09c3944240c..0d7d2fbadfb2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15304,6 +15304,14 @@ S: Supported
W: https://wireless.wiki.kernel.org/en/users/Drivers/ath9k
F: drivers/net/wireless/ath/ath9k/

+QUALCOMM BAM-DMUX WWAN NETWORK DRIVER
+M: Stephan Gerhold <[email protected]>
+L: [email protected]
+L: [email protected]
+S: Maintained
+F: Documentation/devicetree/bindings/net/qcom,bam-dmux.yaml
+F: drivers/net/wwan/qcom_bam_dmux.c
+
QUALCOMM CAMERA SUBSYSTEM DRIVER
M: Robert Foss <[email protected]>
M: Todor Tomov <[email protected]>
diff --git a/drivers/net/wwan/Kconfig b/drivers/net/wwan/Kconfig
index de9384326bc8..efb2b859ab55 100644
--- a/drivers/net/wwan/Kconfig
+++ b/drivers/net/wwan/Kconfig
@@ -38,6 +38,19 @@ config MHI_WWAN_CTRL
To compile this driver as a module, choose M here: the module will be
called mhi_wwan_ctrl.

+config QCOM_BAM_DMUX
+ tristate "Qualcomm BAM-DMUX WWAN network driver"
+ depends on (DMA_ENGINE && PM && QCOM_SMEM_STATE) || COMPILE_TEST
+ help
+ The BAM Data Multiplexer provides access to the network data channels
+ of modems integrated into many older Qualcomm SoCs, e.g. Qualcomm
+ MSM8916 or MSM8974. The connection can be established via QMI/AT from
+ userspace with control ports available through the WWAN subsystem
+ (CONFIG_RPMSG_WWAN_CTRL) or QRTR network sockets (CONFIG_QRTR).
+
+ To compile this driver as a module, choose M here: the module will be
+ called qcom_bam_dmux.
+
config RPMSG_WWAN_CTRL
tristate "RPMSG WWAN control driver"
depends on RPMSG
diff --git a/drivers/net/wwan/Makefile b/drivers/net/wwan/Makefile
index d90ac33abaef..a804f6d9637e 100644
--- a/drivers/net/wwan/Makefile
+++ b/drivers/net/wwan/Makefile
@@ -9,5 +9,6 @@ wwan-objs += wwan_core.o
obj-$(CONFIG_WWAN_HWSIM) += wwan_hwsim.o

obj-$(CONFIG_MHI_WWAN_CTRL) += mhi_wwan_ctrl.o
+obj-$(CONFIG_QCOM_BAM_DMUX) += qcom_bam_dmux.o
obj-$(CONFIG_RPMSG_WWAN_CTRL) += rpmsg_wwan_ctrl.o
obj-$(CONFIG_IOSM) += iosm/
diff --git a/drivers/net/wwan/qcom_bam_dmux.c b/drivers/net/wwan/qcom_bam_dmux.c
new file mode 100644
index 000000000000..b1e69f5263ac
--- /dev/null
+++ b/drivers/net/wwan/qcom_bam_dmux.c
@@ -0,0 +1,907 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Qualcomm BAM-DMUX WWAN network driver
+ * Copyright (c) 2020, Stephan Gerhold <[email protected]>
+ */
+
+#include <linux/atomic.h>
+#include <linux/bitops.h>
+#include <linux/completion.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmaengine.h>
+#include <linux/if_arp.h>
+#include <linux/interrupt.h>
+#include <linux/mod_devicetable.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/soc/qcom/smem_state.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/workqueue.h>
+#include <net/pkt_sched.h>
+
+#define BAM_DMUX_BUFFER_SIZE SZ_2K
+#define BAM_DMUX_HDR_SIZE sizeof(struct bam_dmux_hdr)
+#define BAM_DMUX_MAX_DATA_SIZE (BAM_DMUX_BUFFER_SIZE - BAM_DMUX_HDR_SIZE)
+#define BAM_DMUX_NUM_SKB 32
+
+#define BAM_DMUX_HDR_MAGIC 0x33fc
+
+#define BAM_DMUX_AUTOSUSPEND_DELAY 1000
+#define BAM_DMUX_REMOTE_TIMEOUT msecs_to_jiffies(2000)
+
+enum {
+ BAM_DMUX_CMD_DATA,
+ BAM_DMUX_CMD_OPEN,
+ BAM_DMUX_CMD_CLOSE,
+};
+
+enum {
+ BAM_DMUX_CH_DATA_0,
+ BAM_DMUX_CH_DATA_1,
+ BAM_DMUX_CH_DATA_2,
+ BAM_DMUX_CH_DATA_3,
+ BAM_DMUX_CH_DATA_4,
+ BAM_DMUX_CH_DATA_5,
+ BAM_DMUX_CH_DATA_6,
+ BAM_DMUX_CH_DATA_7,
+ BAM_DMUX_NUM_CH
+};
+
+struct bam_dmux_hdr {
+ u16 magic;
+ u8 signal;
+ u8 cmd;
+ u8 pad;
+ u8 ch;
+ u16 len;
+};
+
+struct bam_dmux_skb_dma {
+ struct bam_dmux *dmux;
+ struct sk_buff *skb;
+ dma_addr_t addr;
+};
+
+struct bam_dmux {
+ struct device *dev;
+
+ int pc_irq;
+ bool pc_state, pc_ack_state;
+ struct qcom_smem_state *pc, *pc_ack;
+ u32 pc_mask, pc_ack_mask;
+ wait_queue_head_t pc_wait;
+ struct completion pc_ack_completion;
+
+ struct dma_chan *rx, *tx;
+ struct bam_dmux_skb_dma rx_skbs[BAM_DMUX_NUM_SKB];
+ struct bam_dmux_skb_dma tx_skbs[BAM_DMUX_NUM_SKB];
+ spinlock_t tx_lock; /* Protect tx_skbs, tx_next_skb */
+ unsigned int tx_next_skb;
+ atomic_long_t tx_deferred_skb;
+ struct work_struct tx_wakeup_work;
+
+ DECLARE_BITMAP(remote_channels, BAM_DMUX_NUM_CH);
+ struct work_struct register_netdev_work;
+ struct net_device *netdevs[BAM_DMUX_NUM_CH];
+};
+
+struct bam_dmux_netdev {
+ struct bam_dmux *dmux;
+ u8 ch;
+};
+
+static void bam_dmux_pc_vote(struct bam_dmux *dmux, bool enable)
+{
+ reinit_completion(&dmux->pc_ack_completion);
+ qcom_smem_state_update_bits(dmux->pc, dmux->pc_mask,
+ enable ? dmux->pc_mask : 0);
+}
+
+static void bam_dmux_pc_ack(struct bam_dmux *dmux)
+{
+ qcom_smem_state_update_bits(dmux->pc_ack, dmux->pc_ack_mask,
+ dmux->pc_ack_state ? 0 : dmux->pc_ack_mask);
+ dmux->pc_ack_state = !dmux->pc_ack_state;
+}
+
+static bool bam_dmux_skb_dma_map(struct bam_dmux_skb_dma *skb_dma,
+ enum dma_data_direction dir)
+{
+ struct device *dev = skb_dma->dmux->dev;
+
+ skb_dma->addr = dma_map_single(dev, skb_dma->skb->data, skb_dma->skb->len, dir);
+ if (dma_mapping_error(dev, skb_dma->addr)) {
+ dev_err(dev, "Failed to DMA map buffer\n");
+ skb_dma->addr = 0;
+ return false;
+ }
+
+ return true;
+}
+
+static void bam_dmux_skb_dma_unmap(struct bam_dmux_skb_dma *skb_dma,
+ enum dma_data_direction dir)
+{
+ dma_unmap_single(skb_dma->dmux->dev, skb_dma->addr, skb_dma->skb->len, dir);
+ skb_dma->addr = 0;
+}
+
+static void bam_dmux_tx_wake_queues(struct bam_dmux *dmux)
+{
+ int i;
+
+ dev_dbg(dmux->dev, "wake queues\n");
+
+ for (i = 0; i < BAM_DMUX_NUM_CH; ++i) {
+ struct net_device *netdev = dmux->netdevs[i];
+
+ if (netdev && netif_running(netdev))
+ netif_wake_queue(netdev);
+ }
+}
+
+static void bam_dmux_tx_stop_queues(struct bam_dmux *dmux)
+{
+ int i;
+
+ dev_dbg(dmux->dev, "stop queues\n");
+
+ for (i = 0; i < BAM_DMUX_NUM_CH; ++i) {
+ struct net_device *netdev = dmux->netdevs[i];
+
+ if (netdev)
+ netif_stop_queue(netdev);
+ }
+}
+
+static void bam_dmux_tx_done(struct bam_dmux_skb_dma *skb_dma)
+{
+ struct bam_dmux *dmux = skb_dma->dmux;
+ unsigned long flags;
+
+ pm_runtime_mark_last_busy(dmux->dev);
+ pm_runtime_put_autosuspend(dmux->dev);
+
+ if (skb_dma->addr)
+ bam_dmux_skb_dma_unmap(skb_dma, DMA_TO_DEVICE);
+
+ spin_lock_irqsave(&dmux->tx_lock, flags);
+ skb_dma->skb = NULL;
+ if (skb_dma == &dmux->tx_skbs[dmux->tx_next_skb % BAM_DMUX_NUM_SKB])
+ bam_dmux_tx_wake_queues(dmux);
+ spin_unlock_irqrestore(&dmux->tx_lock, flags);
+}
+
+static void bam_dmux_tx_callback(void *data)
+{
+ struct bam_dmux_skb_dma *skb_dma = data;
+ struct sk_buff *skb = skb_dma->skb;
+
+ bam_dmux_tx_done(skb_dma);
+ dev_consume_skb_any(skb);
+}
+
+static bool bam_dmux_skb_dma_submit_tx(struct bam_dmux_skb_dma *skb_dma)
+{
+ struct bam_dmux *dmux = skb_dma->dmux;
+ struct dma_async_tx_descriptor *desc;
+
+ desc = dmaengine_prep_slave_single(dmux->tx, skb_dma->addr,
+ skb_dma->skb->len, DMA_MEM_TO_DEV,
+ DMA_PREP_INTERRUPT);
+ if (!desc) {
+ dev_err(dmux->dev, "Failed to prepare TX DMA buffer\n");
+ return false;
+ }
+
+ desc->callback = bam_dmux_tx_callback;
+ desc->callback_param = skb_dma;
+ desc->cookie = dmaengine_submit(desc);
+ return true;
+}
+
+static struct bam_dmux_skb_dma *
+bam_dmux_tx_queue(struct bam_dmux *dmux, struct sk_buff *skb)
+{
+ struct bam_dmux_skb_dma *skb_dma;
+ unsigned long flags;
+
+ spin_lock_irqsave(&dmux->tx_lock, flags);
+
+ skb_dma = &dmux->tx_skbs[dmux->tx_next_skb % BAM_DMUX_NUM_SKB];
+ if (skb_dma->skb) {
+ bam_dmux_tx_stop_queues(dmux);
+ spin_unlock_irqrestore(&dmux->tx_lock, flags);
+ return NULL;
+ }
+ skb_dma->skb = skb;
+
+ dmux->tx_next_skb++;
+ if (dmux->tx_skbs[dmux->tx_next_skb % BAM_DMUX_NUM_SKB].skb)
+ bam_dmux_tx_stop_queues(dmux);
+
+ spin_unlock_irqrestore(&dmux->tx_lock, flags);
+ return skb_dma;
+}
+
+static int bam_dmux_send_cmd(struct bam_dmux_netdev *bndev, u8 cmd)
+{
+ struct bam_dmux *dmux = bndev->dmux;
+ struct bam_dmux_skb_dma *skb_dma;
+ struct bam_dmux_hdr *hdr;
+ struct sk_buff *skb;
+ int ret;
+
+ skb = alloc_skb(sizeof(*hdr), GFP_KERNEL);
+ if (!skb)
+ return -ENOMEM;
+
+ hdr = skb_put_zero(skb, sizeof(*hdr));
+ hdr->magic = BAM_DMUX_HDR_MAGIC;
+ hdr->cmd = cmd;
+ hdr->ch = bndev->ch;
+
+ skb_dma = bam_dmux_tx_queue(dmux, skb);
+ if (!skb_dma) {
+ ret = -EAGAIN;
+ goto free_skb;
+ }
+
+ ret = pm_runtime_get_sync(dmux->dev);
+ if (ret < 0)
+ goto tx_fail;
+
+ if (!bam_dmux_skb_dma_map(skb_dma, DMA_TO_DEVICE)) {
+ ret = -ENOMEM;
+ goto tx_fail;
+ }
+
+ if (!bam_dmux_skb_dma_submit_tx(skb_dma)) {
+ ret = -EIO;
+ goto tx_fail;
+ }
+
+ dma_async_issue_pending(dmux->tx);
+ return 0;
+
+tx_fail:
+ bam_dmux_tx_done(skb_dma);
+free_skb:
+ dev_kfree_skb(skb);
+ return ret;
+}
+
+static int bam_dmux_netdev_open(struct net_device *netdev)
+{
+ struct bam_dmux_netdev *bndev = netdev_priv(netdev);
+ int ret;
+
+ ret = bam_dmux_send_cmd(bndev, BAM_DMUX_CMD_OPEN);
+ if (ret)
+ return ret;
+
+ netif_start_queue(netdev);
+ return 0;
+}
+
+static int bam_dmux_netdev_stop(struct net_device *netdev)
+{
+ struct bam_dmux_netdev *bndev = netdev_priv(netdev);
+
+ netif_stop_queue(netdev);
+ bam_dmux_send_cmd(bndev, BAM_DMUX_CMD_CLOSE);
+ return 0;
+}
+
+static unsigned int needed_room(unsigned int avail, unsigned int needed)
+{
+ if (avail >= needed)
+ return 0;
+ return needed - avail;
+}
+
+static int bam_dmux_tx_prepare_skb(struct bam_dmux_netdev *bndev,
+ struct sk_buff *skb)
+{
+ unsigned int head = needed_room(skb_headroom(skb), BAM_DMUX_HDR_SIZE);
+ unsigned int pad = sizeof(u32) - skb->len % sizeof(u32);
+ unsigned int tail = needed_room(skb_tailroom(skb), pad);
+ struct bam_dmux_hdr *hdr;
+ int ret;
+
+ if (head || tail || skb_cloned(skb)) {
+ ret = pskb_expand_head(skb, head, tail, GFP_ATOMIC);
+ if (ret)
+ return ret;
+ }
+
+ hdr = skb_push(skb, sizeof(*hdr));
+ hdr->magic = BAM_DMUX_HDR_MAGIC;
+ hdr->signal = 0;
+ hdr->cmd = BAM_DMUX_CMD_DATA;
+ hdr->pad = pad;
+ hdr->ch = bndev->ch;
+ hdr->len = skb->len - sizeof(*hdr);
+ if (pad)
+ skb_put_zero(skb, pad);
+
+ return 0;
+}
+
+static netdev_tx_t bam_dmux_netdev_start_xmit(struct sk_buff *skb,
+ struct net_device *netdev)
+{
+ struct bam_dmux_netdev *bndev = netdev_priv(netdev);
+ struct bam_dmux *dmux = bndev->dmux;
+ struct bam_dmux_skb_dma *skb_dma;
+ int active, ret;
+
+ skb_dma = bam_dmux_tx_queue(dmux, skb);
+ if (!skb_dma)
+ return NETDEV_TX_BUSY;
+
+ active = pm_runtime_get(dmux->dev);
+ if (active < 0 && active != -EINPROGRESS)
+ goto drop;
+
+ ret = bam_dmux_tx_prepare_skb(bndev, skb);
+ if (ret)
+ goto drop;
+
+ if (!bam_dmux_skb_dma_map(skb_dma, DMA_TO_DEVICE))
+ goto drop;
+
+ if (active <= 0) {
+ /* Cannot sleep here so mark skb for wakeup handler and return */
+ if (!atomic_long_fetch_or(BIT(skb_dma - dmux->tx_skbs),
+ &dmux->tx_deferred_skb))
+ queue_pm_work(&dmux->tx_wakeup_work);
+ return NETDEV_TX_OK;
+ }
+
+ if (!bam_dmux_skb_dma_submit_tx(skb_dma))
+ goto drop;
+
+ dma_async_issue_pending(dmux->tx);
+ return NETDEV_TX_OK;
+
+drop:
+ bam_dmux_tx_done(skb_dma);
+ dev_kfree_skb_any(skb);
+ return NETDEV_TX_OK;
+}
+
+static void bam_dmux_tx_wakeup_work(struct work_struct *work)
+{
+ struct bam_dmux *dmux = container_of(work, struct bam_dmux, tx_wakeup_work);
+ unsigned long pending;
+ int ret, i;
+
+ ret = pm_runtime_resume_and_get(dmux->dev);
+ if (ret < 0) {
+ dev_err(dmux->dev, "Failed to resume: %d\n", ret);
+ return;
+ }
+
+ pending = atomic_long_xchg(&dmux->tx_deferred_skb, 0);
+ if (!pending)
+ goto out;
+
+ dev_dbg(dmux->dev, "pending skbs after wakeup: %#lx\n", pending);
+ for_each_set_bit(i, &pending, BAM_DMUX_NUM_SKB) {
+ bam_dmux_skb_dma_submit_tx(&dmux->tx_skbs[i]);
+ }
+ dma_async_issue_pending(dmux->tx);
+
+out:
+ pm_runtime_mark_last_busy(dmux->dev);
+ pm_runtime_put_autosuspend(dmux->dev);
+}
+
+static const struct net_device_ops bam_dmux_ops = {
+ .ndo_open = bam_dmux_netdev_open,
+ .ndo_stop = bam_dmux_netdev_stop,
+ .ndo_start_xmit = bam_dmux_netdev_start_xmit,
+};
+
+static const struct device_type wwan_type = {
+ .name = "wwan",
+};
+
+static void bam_dmux_netdev_setup(struct net_device *dev)
+{
+ dev->netdev_ops = &bam_dmux_ops;
+
+ dev->type = ARPHRD_RAWIP;
+ SET_NETDEV_DEVTYPE(dev, &wwan_type);
+ dev->flags = IFF_POINTOPOINT | IFF_NOARP;
+
+ dev->mtu = ETH_DATA_LEN;
+ dev->max_mtu = BAM_DMUX_MAX_DATA_SIZE;
+ dev->needed_headroom = sizeof(struct bam_dmux_hdr);
+ dev->needed_tailroom = sizeof(u32); /* word-aligned */
+ dev->tx_queue_len = DEFAULT_TX_QUEUE_LEN;
+
+ /* This perm addr will be used as interface identifier by IPv6 */
+ dev->addr_assign_type = NET_ADDR_RANDOM;
+ eth_random_addr(dev->perm_addr);
+}
+
+static void bam_dmux_register_netdev_work(struct work_struct *work)
+{
+ struct bam_dmux *dmux = container_of(work, struct bam_dmux, register_netdev_work);
+ struct bam_dmux_netdev *bndev;
+ struct net_device *netdev;
+ int ch, ret;
+
+ for_each_set_bit(ch, dmux->remote_channels, BAM_DMUX_NUM_CH) {
+ if (dmux->netdevs[ch])
+ continue;
+
+ netdev = alloc_netdev(sizeof(*bndev), "rmnet%d", NET_NAME_ENUM,
+ bam_dmux_netdev_setup);
+ if (!netdev)
+ return;
+
+ SET_NETDEV_DEV(netdev, dmux->dev);
+ netdev->dev_port = ch;
+
+ bndev = netdev_priv(netdev);
+ bndev->dmux = dmux;
+ bndev->ch = ch;
+
+ ret = register_netdev(netdev);
+ if (ret) {
+ dev_err(dmux->dev, "Failed to register netdev for channel %u: %d\n",
+ ch, ret);
+ free_netdev(netdev);
+ return;
+ }
+
+ dmux->netdevs[ch] = netdev;
+ }
+}
+
+static void bam_dmux_rx_callback(void *data);
+
+static bool bam_dmux_skb_dma_submit_rx(struct bam_dmux_skb_dma *skb_dma)
+{
+ struct bam_dmux *dmux = skb_dma->dmux;
+ struct dma_async_tx_descriptor *desc;
+
+ desc = dmaengine_prep_slave_single(dmux->rx, skb_dma->addr,
+ skb_dma->skb->len, DMA_DEV_TO_MEM,
+ DMA_PREP_INTERRUPT);
+ if (!desc) {
+ dev_err(dmux->dev, "Failed to prepare RX DMA buffer\n");
+ return false;
+ }
+
+ desc->callback = bam_dmux_rx_callback;
+ desc->callback_param = skb_dma;
+ desc->cookie = dmaengine_submit(desc);
+ return true;
+}
+
+static bool bam_dmux_skb_dma_queue_rx(struct bam_dmux_skb_dma *skb_dma, gfp_t gfp)
+{
+ if (!skb_dma->skb) {
+ skb_dma->skb = __netdev_alloc_skb(NULL, BAM_DMUX_BUFFER_SIZE, gfp);
+ if (!skb_dma->skb)
+ return false;
+ skb_put(skb_dma->skb, BAM_DMUX_BUFFER_SIZE);
+ }
+
+ return bam_dmux_skb_dma_map(skb_dma, DMA_FROM_DEVICE) &&
+ bam_dmux_skb_dma_submit_rx(skb_dma);
+}
+
+static void bam_dmux_cmd_data(struct bam_dmux_skb_dma *skb_dma)
+{
+ struct bam_dmux *dmux = skb_dma->dmux;
+ struct sk_buff *skb = skb_dma->skb;
+ struct bam_dmux_hdr *hdr = (struct bam_dmux_hdr *)skb->data;
+ struct net_device *netdev = dmux->netdevs[hdr->ch];
+
+ if (!netdev || !netif_running(netdev)) {
+ dev_warn(dmux->dev, "Data for inactive channel %u\n", hdr->ch);
+ return;
+ }
+
+ if (hdr->len > BAM_DMUX_MAX_DATA_SIZE) {
+ dev_err(dmux->dev, "Data larger than buffer? (%u > %u)\n",
+ hdr->len, (u16)BAM_DMUX_MAX_DATA_SIZE);
+ return;
+ }
+
+ skb_dma->skb = NULL; /* Hand over to network stack */
+
+ skb_pull(skb, sizeof(*hdr));
+ skb_trim(skb, hdr->len);
+ skb->dev = netdev;
+
+ /* Only Raw-IP/QMAP is supported by this driver */
+ switch (skb->data[0] & 0xf0) {
+ case 0x40:
+ skb->protocol = htons(ETH_P_IP);
+ break;
+ case 0x60:
+ skb->protocol = htons(ETH_P_IPV6);
+ break;
+ default:
+ skb->protocol = htons(ETH_P_MAP);
+ break;
+ }
+
+ netif_receive_skb(skb);
+}
+
+static void bam_dmux_cmd_open(struct bam_dmux *dmux, struct bam_dmux_hdr *hdr)
+{
+ struct net_device *netdev = dmux->netdevs[hdr->ch];
+
+ dev_dbg(dmux->dev, "open channel: %u\n", hdr->ch);
+
+ if (__test_and_set_bit(hdr->ch, dmux->remote_channels)) {
+ dev_warn(dmux->dev, "Channel already open: %u\n", hdr->ch);
+ return;
+ }
+
+ if (netdev) {
+ netif_device_attach(netdev);
+ } else {
+ /* Cannot sleep here, schedule work to register the netdev */
+ schedule_work(&dmux->register_netdev_work);
+ }
+}
+
+static void bam_dmux_cmd_close(struct bam_dmux *dmux, struct bam_dmux_hdr *hdr)
+{
+ struct net_device *netdev = dmux->netdevs[hdr->ch];
+
+ dev_dbg(dmux->dev, "close channel: %u\n", hdr->ch);
+
+ if (!__test_and_clear_bit(hdr->ch, dmux->remote_channels)) {
+ dev_err(dmux->dev, "Channel not open: %u\n", hdr->ch);
+ return;
+ }
+
+ if (netdev)
+ netif_device_detach(netdev);
+}
+
+static void bam_dmux_rx_callback(void *data)
+{
+ struct bam_dmux_skb_dma *skb_dma = data;
+ struct bam_dmux *dmux = skb_dma->dmux;
+ struct sk_buff *skb = skb_dma->skb;
+ struct bam_dmux_hdr *hdr = (struct bam_dmux_hdr *)skb->data;
+
+ bam_dmux_skb_dma_unmap(skb_dma, DMA_FROM_DEVICE);
+
+ if (hdr->magic != BAM_DMUX_HDR_MAGIC) {
+ dev_err(dmux->dev, "Invalid magic in header: %#x\n", hdr->magic);
+ goto out;
+ }
+
+ if (hdr->ch >= BAM_DMUX_NUM_CH) {
+ dev_dbg(dmux->dev, "Unsupported channel: %u\n", hdr->ch);
+ goto out;
+ }
+
+ switch (hdr->cmd) {
+ case BAM_DMUX_CMD_DATA:
+ bam_dmux_cmd_data(skb_dma);
+ break;
+ case BAM_DMUX_CMD_OPEN:
+ bam_dmux_cmd_open(dmux, hdr);
+ break;
+ case BAM_DMUX_CMD_CLOSE:
+ bam_dmux_cmd_close(dmux, hdr);
+ break;
+ default:
+ dev_err(dmux->dev, "Unsupported command %u on channel %u\n",
+ hdr->cmd, hdr->ch);
+ break;
+ }
+
+out:
+ if (bam_dmux_skb_dma_queue_rx(skb_dma, GFP_ATOMIC))
+ dma_async_issue_pending(dmux->rx);
+}
+
+static bool bam_dmux_power_on(struct bam_dmux *dmux)
+{
+ struct device *dev = dmux->dev;
+ struct dma_slave_config dma_rx_conf = {
+ .direction = DMA_DEV_TO_MEM,
+ .src_maxburst = BAM_DMUX_BUFFER_SIZE,
+ };
+ int i;
+
+ dmux->rx = dma_request_chan(dev, "rx");
+ if (IS_ERR(dmux->rx)) {
+ dev_err(dev, "Failed to request RX DMA channel: %pe\n", dmux->rx);
+ dmux->rx = NULL;
+ return false;
+ }
+ dmaengine_slave_config(dmux->rx, &dma_rx_conf);
+
+ for (i = 0; i < BAM_DMUX_NUM_SKB; i++) {
+ if (!bam_dmux_skb_dma_queue_rx(&dmux->rx_skbs[i], GFP_KERNEL))
+ return false;
+ }
+ dma_async_issue_pending(dmux->rx);
+
+ return true;
+}
+
+static void bam_dmux_free_skbs(struct bam_dmux_skb_dma skbs[],
+ enum dma_data_direction dir)
+{
+ int i;
+
+ for (i = 0; i < BAM_DMUX_NUM_SKB; i++) {
+ struct bam_dmux_skb_dma *skb_dma = &skbs[i];
+
+ if (skb_dma->addr)
+ bam_dmux_skb_dma_unmap(skb_dma, dir);
+ if (skb_dma->skb) {
+ dev_kfree_skb(skb_dma->skb);
+ skb_dma->skb = NULL;
+ }
+ }
+}
+
+static void bam_dmux_power_off(struct bam_dmux *dmux)
+{
+ if (dmux->tx) {
+ dmaengine_terminate_sync(dmux->tx);
+ dma_release_channel(dmux->tx);
+ dmux->tx = NULL;
+ }
+
+ if (dmux->rx) {
+ dmaengine_terminate_sync(dmux->rx);
+ dma_release_channel(dmux->rx);
+ dmux->rx = NULL;
+ }
+
+ bam_dmux_free_skbs(dmux->rx_skbs, DMA_FROM_DEVICE);
+}
+
+static irqreturn_t bam_dmux_pc_irq(int irq, void *data)
+{
+ struct bam_dmux *dmux = data;
+ bool new_state = !dmux->pc_state;
+
+ dev_dbg(dmux->dev, "pc: %u\n", new_state);
+
+ if (new_state) {
+ if (bam_dmux_power_on(dmux))
+ bam_dmux_pc_ack(dmux);
+ else
+ bam_dmux_power_off(dmux);
+ } else {
+ bam_dmux_power_off(dmux);
+ bam_dmux_pc_ack(dmux);
+ }
+
+ dmux->pc_state = new_state;
+ wake_up_all(&dmux->pc_wait);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t bam_dmux_pc_ack_irq(int irq, void *data)
+{
+ struct bam_dmux *dmux = data;
+
+ dev_dbg(dmux->dev, "pc ack\n");
+ complete_all(&dmux->pc_ack_completion);
+
+ return IRQ_HANDLED;
+}
+
+static int bam_dmux_runtime_suspend(struct device *dev)
+{
+ struct bam_dmux *dmux = dev_get_drvdata(dev);
+
+ dev_dbg(dev, "runtime suspend\n");
+ bam_dmux_pc_vote(dmux, false);
+
+ return 0;
+}
+
+static int __maybe_unused bam_dmux_runtime_resume(struct device *dev)
+{
+ struct bam_dmux *dmux = dev_get_drvdata(dev);
+
+ dev_dbg(dev, "runtime resume\n");
+
+ /* Wait until previous power down was acked */
+ if (!wait_for_completion_timeout(&dmux->pc_ack_completion,
+ BAM_DMUX_REMOTE_TIMEOUT))
+ return -ETIMEDOUT;
+
+ /* Vote for power state */
+ bam_dmux_pc_vote(dmux, true);
+
+ /* Wait for ack */
+ if (!wait_for_completion_timeout(&dmux->pc_ack_completion,
+ BAM_DMUX_REMOTE_TIMEOUT)) {
+ bam_dmux_pc_vote(dmux, false);
+ return -ETIMEDOUT;
+ }
+
+ /* Wait until we're up */
+ if (!wait_event_timeout(dmux->pc_wait, dmux->pc_state,
+ BAM_DMUX_REMOTE_TIMEOUT)) {
+ bam_dmux_pc_vote(dmux, false);
+ return -ETIMEDOUT;
+ }
+
+ /* Ensure that we actually initialized successfully */
+ if (!dmux->rx) {
+ bam_dmux_pc_vote(dmux, false);
+ return -ENXIO;
+ }
+
+ /* Request TX channel if necessary */
+ if (dmux->tx)
+ return 0;
+
+ dmux->tx = dma_request_chan(dev, "tx");
+ if (IS_ERR(dmux->tx)) {
+ dev_err(dev, "Failed to request TX DMA channel: %pe\n", dmux->tx);
+ dmux->tx = NULL;
+ bam_dmux_runtime_suspend(dev);
+ return -ENXIO;
+ }
+
+ return 0;
+}
+
+static int bam_dmux_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct bam_dmux *dmux;
+ int ret, pc_ack_irq, i;
+ unsigned int bit;
+
+ dmux = devm_kzalloc(dev, sizeof(*dmux), GFP_KERNEL);
+ if (!dmux)
+ return -ENOMEM;
+
+ dmux->dev = dev;
+ platform_set_drvdata(pdev, dmux);
+
+ dmux->pc_irq = platform_get_irq_byname(pdev, "pc");
+ if (dmux->pc_irq < 0)
+ return dmux->pc_irq;
+
+ pc_ack_irq = platform_get_irq_byname(pdev, "pc-ack");
+ if (pc_ack_irq < 0)
+ return pc_ack_irq;
+
+ dmux->pc = devm_qcom_smem_state_get(dev, "pc", &bit);
+ if (IS_ERR(dmux->pc))
+ return dev_err_probe(dev, PTR_ERR(dmux->pc),
+ "Failed to get pc state\n");
+ dmux->pc_mask = BIT(bit);
+
+ dmux->pc_ack = devm_qcom_smem_state_get(dev, "pc-ack", &bit);
+ if (IS_ERR(dmux->pc_ack))
+ return dev_err_probe(dev, PTR_ERR(dmux->pc_ack),
+ "Failed to get pc-ack state\n");
+ dmux->pc_ack_mask = BIT(bit);
+
+ init_waitqueue_head(&dmux->pc_wait);
+ init_completion(&dmux->pc_ack_completion);
+ complete_all(&dmux->pc_ack_completion);
+
+ spin_lock_init(&dmux->tx_lock);
+ INIT_WORK(&dmux->tx_wakeup_work, bam_dmux_tx_wakeup_work);
+ INIT_WORK(&dmux->register_netdev_work, bam_dmux_register_netdev_work);
+
+ for (i = 0; i < BAM_DMUX_NUM_SKB; i++) {
+ dmux->rx_skbs[i].dmux = dmux;
+ dmux->tx_skbs[i].dmux = dmux;
+ }
+
+ /* Runtime PM manages our own power vote.
+ * Note that the RX path may be active even if we are runtime suspended,
+ * since it is controlled by the remote side.
+ */
+ pm_runtime_set_autosuspend_delay(dev, BAM_DMUX_AUTOSUSPEND_DELAY);
+ pm_runtime_use_autosuspend(dev);
+ pm_runtime_enable(dev);
+
+ ret = devm_request_threaded_irq(dev, pc_ack_irq, NULL, bam_dmux_pc_ack_irq,
+ IRQF_ONESHOT, NULL, dmux);
+ if (ret)
+ return ret;
+
+ ret = devm_request_threaded_irq(dev, dmux->pc_irq, NULL, bam_dmux_pc_irq,
+ IRQF_ONESHOT, NULL, dmux);
+ if (ret)
+ return ret;
+
+ ret = irq_get_irqchip_state(dmux->pc_irq, IRQCHIP_STATE_LINE_LEVEL,
+ &dmux->pc_state);
+ if (ret)
+ return ret;
+
+ /* Check if remote finished initialization before us */
+ if (dmux->pc_state) {
+ if (bam_dmux_power_on(dmux))
+ bam_dmux_pc_ack(dmux);
+ else
+ bam_dmux_power_off(dmux);
+ }
+
+ return 0;
+}
+
+static int bam_dmux_remove(struct platform_device *pdev)
+{
+ struct bam_dmux *dmux = platform_get_drvdata(pdev);
+ struct device *dev = dmux->dev;
+ LIST_HEAD(list);
+ int i;
+
+ /* Unregister network interfaces */
+ cancel_work_sync(&dmux->register_netdev_work);
+ rtnl_lock();
+ for (i = 0; i < BAM_DMUX_NUM_CH; ++i)
+ if (dmux->netdevs[i])
+ unregister_netdevice_queue(dmux->netdevs[i], &list);
+ unregister_netdevice_many(&list);
+ rtnl_unlock();
+ cancel_work_sync(&dmux->tx_wakeup_work);
+
+ /* Drop our own power vote */
+ pm_runtime_disable(dev);
+ pm_runtime_dont_use_autosuspend(dev);
+ bam_dmux_runtime_suspend(dev);
+ pm_runtime_set_suspended(dev);
+
+ /* Try to wait for remote side to drop power vote */
+ if (!wait_event_timeout(dmux->pc_wait, !dmux->rx, BAM_DMUX_REMOTE_TIMEOUT))
+ dev_err(dev, "Timed out waiting for remote side to suspend\n");
+
+ /* Make sure everything is cleaned up before we return */
+ disable_irq(dmux->pc_irq);
+ bam_dmux_power_off(dmux);
+ bam_dmux_free_skbs(dmux->tx_skbs, DMA_TO_DEVICE);
+
+ return 0;
+}
+
+static const struct dev_pm_ops bam_dmux_pm_ops = {
+ SET_RUNTIME_PM_OPS(bam_dmux_runtime_suspend, bam_dmux_runtime_resume, NULL)
+};
+
+static const struct of_device_id bam_dmux_of_match[] = {
+ { .compatible = "qcom,bam-dmux" },
+ { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, bam_dmux_of_match);
+
+static struct platform_driver bam_dmux_driver = {
+ .probe = bam_dmux_probe,
+ .remove = bam_dmux_remove,
+ .driver = {
+ .name = "bam-dmux",
+ .pm = &bam_dmux_pm_ops,
+ .of_match_table = bam_dmux_of_match,
+ },
+};
+module_platform_driver(bam_dmux_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Qualcomm BAM-DMUX WWAN Network Driver");
+MODULE_AUTHOR("Stephan Gerhold <[email protected]>");
--
2.32.0

2021-07-19 15:04:34

by Stephan Gerhold

Subject: [RFC PATCH net-next 1/4] dt-bindings: dmaengine: bam_dma: Add remote power collapse mode

2021-07-19 15:04:41

by Stephan Gerhold

Subject: [RFC PATCH net-next 2/4] dmaengine: qcom: bam_dma: Add remote power collapse mode

In some configurations, the BAM DMA controller is set up by a remote
processor and the local processor can simply start making use of it
without setting up the BAM. This is already supported using the
"qcom,controlled-remotely" property.

However, another possible configuration is that the remote processor
is responsible for powering up the BAM, but we are still responsible
for initializing it (e.g. resetting it).

This configuration is quite challenging to handle properly because
the power control is handled through separate channels
(e.g. device-specific SMSM interrupts / smem-states). Great care
must be taken to ensure the BAM registers are not accessed while
the BAM is power-collapsed since this results in a bus stall.

Attempt to support this configuration with minimal device-specific
code in the bam_dma driver by tracking the number of requested
channels. Consumers of DMA channels are responsible for requesting
DMA channels only while the BAM is powered on by the remote processor,
and for releasing them before the BAM is power-collapsed.

When the first channel is requested, the BAM is initialized (reset);
when the last channel is released, it is put back into reset.
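
The channel counting described above can be modeled in plain userspace C.
This is only a sketch under stated assumptions: struct fake_bam,
fake_bam_alloc_chan() and fake_bam_free_chan() are hypothetical names used
for illustration, the register accesses of the real driver are replaced by
a flag, and locking is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller state, not the real struct bam_device. */
struct fake_bam {
	bool remote_power_collapse;	/* mirrors the proposed DT property */
	uint32_t active_channels;	/* number of channels currently requested */
	bool initialized;		/* true between init and the final reset */
	unsigned int init_count;	/* how often the controller was (re)initialized */
};

/* Request one channel; the first request initializes the controller. */
static void fake_bam_alloc_chan(struct fake_bam *bam)
{
	if (bam->active_channels++ == 0 && bam->remote_power_collapse) {
		bam->initialized = true;
		bam->init_count++;
	}
}

/* Release one channel; the last release puts the controller back into reset. */
static void fake_bam_free_chan(struct fake_bam *bam)
{
	assert(bam->active_channels > 0);	/* an unbalanced free is a caller bug */

	if (--bam->active_channels == 0 && bam->remote_power_collapse)
		bam->initialized = false;
}
```

In this model a consumer that requests two channels triggers a single
initialization, and the controller only returns to reset once both channels
have been released again. Requesting a channel afterwards initializes it a
second time, which is the overhead mentioned in the notes of this patch.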

Signed-off-by: Stephan Gerhold <[email protected]>
---
NOTE: This is *not* a compile-time requirement for the BAM-DMUX driver
so this could also go through the dmaengine tree.

I tried to come up with other solutions for this situation, but this
is the cleanest I came up with so far. The main advantage is that
it keeps the bam_dma driver generic and fairly simple. The main
disadvantage might be that there is some overhead when the DMA channels
are repeatedly requested and released (the BAM-DMUX driver uses
PM runtime to autosuspend after 1 second of inactivity).

Some alternative ideas (but I'm not sure how they would work exactly):

- Have some dmaengine_*() operation to make the bam_dma driver aware
that the BAM is power-collapsed (instead of requesting/releasing
the channels every time).

- Give the BAM-DMUX power control IRQ to bam_dma so it knows when
the BAM is power-collapsed or not.

- Perhaps even give the smem-state to bam_dma so it could request
powering on the BAM itself. This would be quite strange though,
since bam_dma already uses runtime PM but differently (only active
during register writes, not necessarily active during transfers).

Note however that the power control of BAM-DMUX also involves
queuing RX buffers so there would be much more coordination
needed between bam_dma and bam-dmux.

All in all, I think the solution in this patch is still the cleanest
approach so far.
---
drivers/dma/qcom/bam_dma.c | 88 ++++++++++++++++++++++++--------------
1 file changed, 56 insertions(+), 32 deletions(-)

diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index c8a77b428b52..8bf6c50bda73 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -388,6 +388,8 @@ struct bam_device {
/* execution environment ID, from DT */
u32 ee;
bool controlled_remotely;
+ bool remote_power_collapse;
+ u32 active_channels;

const struct reg_offset_data *layout;

@@ -415,6 +417,44 @@ static inline void __iomem *bam_addr(struct bam_device *bdev, u32 pipe,
r.ee_mult * bdev->ee;
}

+/**
+ * bam_reset - reset and initialize BAM registers
+ * @bdev: bam device
+ */
+static void bam_reset(struct bam_device *bdev)
+{
+ u32 val;
+
+ /* s/w reset bam */
+ /* after reset all pipes are disabled and idle */
+ val = readl_relaxed(bam_addr(bdev, 0, BAM_CTRL));
+ val |= BAM_SW_RST;
+ writel_relaxed(val, bam_addr(bdev, 0, BAM_CTRL));
+ val &= ~BAM_SW_RST;
+ writel_relaxed(val, bam_addr(bdev, 0, BAM_CTRL));
+
+ /* make sure previous stores are visible before enabling BAM */
+ wmb();
+
+ /* enable bam */
+ val |= BAM_EN;
+ writel_relaxed(val, bam_addr(bdev, 0, BAM_CTRL));
+
+ /* set descriptor threshhold, start with 4 bytes */
+ writel_relaxed(DEFAULT_CNT_THRSHLD,
+ bam_addr(bdev, 0, BAM_DESC_CNT_TRSHLD));
+
+ /* Enable default set of h/w workarounds, ie all except BAM_FULL_PIPE */
+ writel_relaxed(BAM_CNFG_BITS_DEFAULT, bam_addr(bdev, 0, BAM_CNFG_BITS));
+
+ /* enable irqs for errors */
+ writel_relaxed(BAM_ERROR_EN | BAM_HRESP_ERR_EN,
+ bam_addr(bdev, 0, BAM_IRQ_EN));
+
+ /* unmask global bam interrupt */
+ writel_relaxed(BAM_IRQ_MSK, bam_addr(bdev, 0, BAM_IRQ_SRCS_MSK_EE));
+}
+
/**
* bam_reset_channel - Reset individual BAM DMA channel
* @bchan: bam channel
@@ -512,6 +552,9 @@ static int bam_alloc_chan(struct dma_chan *chan)
return -ENOMEM;
}

+ if (bdev->active_channels++ == 0 && bdev->remote_power_collapse)
+ bam_reset(bdev);
+
return 0;
}

@@ -565,6 +608,13 @@ static void bam_free_chan(struct dma_chan *chan)
/* disable irq */
writel_relaxed(0, bam_addr(bdev, bchan->id, BAM_P_IRQ_EN));

+ if (--bdev->active_channels == 0 && bdev->remote_power_collapse) {
+ /* s/w reset bam */
+ val = readl_relaxed(bam_addr(bdev, 0, BAM_CTRL));
+ val |= BAM_SW_RST;
+ writel_relaxed(val, bam_addr(bdev, 0, BAM_CTRL));
+ }
+
err:
pm_runtime_mark_last_busy(bdev->dev);
pm_runtime_put_autosuspend(bdev->dev);
@@ -1164,38 +1214,10 @@ static int bam_init(struct bam_device *bdev)
bdev->num_channels = val & BAM_NUM_PIPES_MASK;
}

- if (bdev->controlled_remotely)
+ if (bdev->controlled_remotely || bdev->remote_power_collapse)
return 0;

- /* s/w reset bam */
- /* after reset all pipes are disabled and idle */
- val = readl_relaxed(bam_addr(bdev, 0, BAM_CTRL));
- val |= BAM_SW_RST;
- writel_relaxed(val, bam_addr(bdev, 0, BAM_CTRL));
- val &= ~BAM_SW_RST;
- writel_relaxed(val, bam_addr(bdev, 0, BAM_CTRL));
-
- /* make sure previous stores are visible before enabling BAM */
- wmb();
-
- /* enable bam */
- val |= BAM_EN;
- writel_relaxed(val, bam_addr(bdev, 0, BAM_CTRL));
-
- /* set descriptor threshhold, start with 4 bytes */
- writel_relaxed(DEFAULT_CNT_THRSHLD,
- bam_addr(bdev, 0, BAM_DESC_CNT_TRSHLD));
-
- /* Enable default set of h/w workarounds, ie all except BAM_FULL_PIPE */
- writel_relaxed(BAM_CNFG_BITS_DEFAULT, bam_addr(bdev, 0, BAM_CNFG_BITS));
-
- /* enable irqs for errors */
- writel_relaxed(BAM_ERROR_EN | BAM_HRESP_ERR_EN,
- bam_addr(bdev, 0, BAM_IRQ_EN));
-
- /* unmask global bam interrupt */
- writel_relaxed(BAM_IRQ_MSK, bam_addr(bdev, 0, BAM_IRQ_SRCS_MSK_EE));
-
+ bam_reset(bdev);
return 0;
}

@@ -1257,8 +1279,10 @@ static int bam_dma_probe(struct platform_device *pdev)

bdev->controlled_remotely = of_property_read_bool(pdev->dev.of_node,
"qcom,controlled-remotely");
+ bdev->remote_power_collapse = of_property_read_bool(pdev->dev.of_node,
+ "qcom,remote-power-collapse");

- if (bdev->controlled_remotely) {
+ if (bdev->controlled_remotely || bdev->remote_power_collapse) {
ret = of_property_read_u32(pdev->dev.of_node, "num-channels",
&bdev->num_channels);
if (ret)
@@ -1270,7 +1294,7 @@ static int bam_dma_probe(struct platform_device *pdev)
dev_err(bdev->dev, "num-ees unspecified in dt\n");
}

- if (bdev->controlled_remotely)
+ if (bdev->controlled_remotely || bdev->remote_power_collapse)
bdev->bamclk = devm_clk_get_optional(bdev->dev, "bam_clk");
else
bdev->bamclk = devm_clk_get(bdev->dev, "bam_clk");
--
2.32.0

2021-07-19 15:07:12

by Stephan Gerhold

Subject: [RFC PATCH net-next 3/4] dt-bindings: net: Add schema for Qualcomm BAM-DMUX

2021-07-19 17:37:08

by Jeffrey Hugo

Subject: Re: [RFC PATCH net-next 0/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

On Mon, Jul 19, 2021 at 9:01 AM Stephan Gerhold <[email protected]> wrote:
>
> The BAM Data Multiplexer provides access to the network data channels
> of modems integrated into many older Qualcomm SoCs, e.g. Qualcomm MSM8916
> or MSM8974. This series adds a driver that allows using it.
>
> For more information about BAM-DMUX, see PATCH 4/4.
>
> Shortly said, BAM-DMUX is built using a simple protocol layer on top of
> a DMA engine (Qualcomm BAM DMA). For BAM-DMUX, the BAM DMA engine runs in
> a quite strange mode that I call "remote power collapse", where the
> modem/remote side is responsible for powering on the BAM when needed but we
> are responsible to initialize it. The BAM is power-collapsed when unneeded
> by coordinating power control via bidirectional interrupts from the
> BAM-DMUX driver.

The hardware is physically located on the modem, and tied to the modem
regulators, etc. The modem has the ultimate "off" switch. However,
due to the BAM architecture (which is complicated), configuration uses
cooperation on both ends.

>
> The series first adds one possible solution for handling this "remote power
> collapse" mode in the bam_dma driver, then it adds the BAM-DMUX driver to
> the WWAN subsystem. Note that the BAM-DMUX driver does not actually make
> use of the WWAN subsystem yet, since I'm not sure how to fit it in there
> yet (see PATCH 4/4).
>
> Please note that all of the changes in this patch series are based on
> a fairly complicated driver from Qualcomm [1].
> I do not have access to any documentation about "BAM-DMUX". :(

I'm pretty sure I still have the internal docs.

Are there specific things you want to know?

2021-07-19 17:49:26

by Loic Poulain

Subject: Re: [RFC PATCH net-next 4/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

Hi Stephan,

On Mon, 19 Jul 2021 at 17:01, Stephan Gerhold <[email protected]> wrote:
>
> I'm not sure how to integrate the driver with the WWAN subsystem yet.
> At the moment the driver creates network interfaces for all channels
> announced by the modem, it does not make use of the WWAN link management
> yet. Unfortunately, this is a bit complicated:
>
> Both QMAP and the built-in multiplexing layer might be needed at some point.
> There are firmware versions that do not support QMAP and the other way around
> (the built-in multiplexing was disabled on very recent firmware versions).
> Only userspace can check if QMAP is supported in the firmware (via QMI).
>
> I could ignore QMAP completely for now but I think someone will show up
> who will need this eventually. And if there is going to be common code for
> QMAP/rmnet link management it would be nice if BAM-DMUX could also make
> use of it.

I have this on my TODO list for mhi-net QMAP.

> But the question is, how could this look like? How do we know if we should
> create a link for QMAP or a BAM-DMUX channel? Does it even make sense
> to manage the 1-8 channels via the WWAN link management?

Couldn't it be specified via dts (property or different compatible
string)? Would it make sense to have two drivers (with a common core) to
manage either the multi-channel BAM modems or the newer QMAP-based
single-channel BAM modems?

>
> Another problem is that the WWAN subsystem currently creates all network
> interfaces below the common WWAN device. This means that userspace like
> ModemManager has no way to check which driver provides them. This is
> necessary though to decide how to set it up via QMI (ModemManager uses it).

Well, I have quite a similar concern since I'm currently porting
mhi-net mbim to the wwan framework. I was thinking about not making
the wwan device the parent of the network link/netdev (in the same way
as wlan0 is not a child of the ieee80211 device), but I'm not sure if
that's a good idea since we cannot really consider the driver name part
of the uapi.

The way links are created is normally abstracted, so if you know which
bam variant you have from wwan network driver side (e.g. via dts), you
should have nothing to check on the user side, except the session id.
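
As a rough illustration of the "different compatible string" idea, the
lookup could be modeled in userspace C as below. This is a sketch only:
"qcom,bam-dmux" is the compatible string from the patch, while
"example,bam-dmux-qmap", enum dmux_variant and
dmux_variant_from_compatible() are hypothetical names invented for this
example and do not correspond to any existing binding or kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variants a driver core could distinguish. */
enum dmux_variant {
	DMUX_VARIANT_UNKNOWN,
	DMUX_VARIANT_MULTI_CHANNEL,	/* one netdev per BAM-DMUX channel */
	DMUX_VARIANT_QMAP,		/* single channel, QMAP on top */
};

struct variant_match {
	const char *compatible;
	enum dmux_variant variant;
};

static const struct variant_match variant_table[] = {
	{ "qcom,bam-dmux", DMUX_VARIANT_MULTI_CHANNEL },	/* from the patch */
	{ "example,bam-dmux-qmap", DMUX_VARIANT_QMAP },	/* hypothetical, NOT a real binding */
};

/* Map a compatible string to the variant, similar in spirit to match data. */
static enum dmux_variant dmux_variant_from_compatible(const char *compatible)
{
	assert(compatible != NULL);

	for (size_t i = 0; i < sizeof(variant_table) / sizeof(variant_table[0]); i++)
		if (strcmp(variant_table[i].compatible, compatible) == 0)
			return variant_table[i].variant;

	return DMUX_VARIANT_UNKNOWN;
}
```

With such a table the variant is fixed by the device tree, so userspace
would not need to identify the driver to know how links are created. It
does not address firmware versions where QMAP support can only be detected
via QMI.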

Regards,
Loic

2021-07-20 00:29:50

by Jeffrey Hugo

Subject: Re: [RFC PATCH net-next 0/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

On 7/19/2021 12:23 PM, Stephan Gerhold wrote:
> On Mon, Jul 19, 2021 at 09:43:27AM -0600, Jeffrey Hugo wrote:
>> On Mon, Jul 19, 2021 at 9:01 AM Stephan Gerhold <[email protected]> wrote:
>>>
>>> The BAM Data Multiplexer provides access to the network data channels
>>> of modems integrated into many older Qualcomm SoCs, e.g. Qualcomm MSM8916
>>> or MSM8974. This series adds a driver that allows using it.
>>>
>>> For more information about BAM-DMUX, see PATCH 4/4.
>>>
>>> Shortly said, BAM-DMUX is built using a simple protocol layer on top of
>>> a DMA engine (Qualcomm BAM DMA). For BAM-DMUX, the BAM DMA engine runs in
>>> a quite strange mode that I call "remote power collapse", where the
>>> modem/remote side is responsible for powering on the BAM when needed but we
>>> are responsible to initialize it. The BAM is power-collapsed when unneeded
>>> by coordinating power control via bidirectional interrupts from the
>>> BAM-DMUX driver.
>>
>> The hardware is physically located on the modem, and tied to the modem
>> regulators, etc. The modem has the ultimate "off" switch. However,
>> due to the BAM architecture (which is complicated), configuration uses
>> cooperation on both ends.
>>
>
> What I find strange is that it wasn't done similarly to e.g. Slimbus
> which has a fairly similar setup. (I used that driver as inspiration for
> how to use the mainline qcom_bam driver instead of the "SPS" from
> downstream.)
>
> Slimbus uses qcom,controlled-remotely together with the LPASS
> remoteproc, so it looks like there LPASS does both power-collapse
> and initialization of the BAM. Whereas here the modem does the
> power-collapse but we're supposed to do the initialization.

I suspect I don't have a satisfactory answer for you. The teams that
did slimbus were not the teams involved in the bam_dmux, and the two
didn't talk to each other. The bam_dmux side wasn't aware of the
slimbus situation at the time. I don't know if the slimbus folks knew
about bam_dmux. If you have two silos working independently, it's
unlikely they will create exactly the same solution.

>
>>>
>>> The series first adds one possible solution for handling this "remote power
>>> collapse" mode in the bam_dma driver, then it adds the BAM-DMUX driver to
>>> the WWAN subsystem. Note that the BAM-DMUX driver does not actually make
>>> use of the WWAN subsystem yet, since I'm not sure how to fit it in there
>>> yet (see PATCH 4/4).
>>>
>>> Please note that all of the changes in this patch series are based on
>>> a fairly complicated driver from Qualcomm [1].
>>> I do not have access to any documentation about "BAM-DMUX". :(
>>
>> I'm pretty sure I still have the internal docs.
>>
>> Are there specific things you want to know?
>
> Oh, thanks a lot for asking! I mainly mentioned this here to avoid
> in-depth questions about the hardware (since I can't answer those).
>
> I can probably think of many, many questions, but I'll try to limit
> myself to the two I'm most confused about. :-)
>
>
> It's somewhat unrelated to this initial patch set since I'm not using
> QMAP at the moment, but I'm quite confused about the "MTU negotiation
> feature" that you added support for in [1]. (I *think* that is you,
> right?) :)

Yes. Do I owe you for some brain damage? :)

>
> The part that I somewhat understand is the "signal" sent in the "OPEN"
> command from the modem. It tells us the maximum buffer size the modem
> is willing to accept for TX packets ("ul_mtu" in that commit).
>
> Similarly, if we send "OPEN" to the modem we make the modem aware
> of our maximum RX buffer size plus the number of RX buffers.
> (create_open_signal() function).
>
> The part that is confusing me is the way the "dynamic MTU" is
> enabled/disabled based on the "signal" in "DATA" commands as well.
> (process_dynamic_mtu() function). When would that happen? The code
> suggests that the modem might just suddenly announce that the large
> MTU should be used from now on. But the "buffer_size" is only changed
> for newly queued RX buffers so I'm not even sure how the modem knows
> that it can now send more data at once.
>
> Any chance you could clarify how this should work exactly?

So, I think some of this might make more sense after my response to
question #2.

I don't know how much of this translates to modern platforms. I don't
really work on MSMs anymore, but I can convey what I recall and how
things were "back then".

So, essentially the change you are looking at is the bam_dmux portion of
an overall feature for improving the performance of what was known as
"tethered rmnet".

Per my understanding (which the documentation of this feature
reinforces), tethered rmnet was chiefly a test feature. Your "data"
(websites, email, etc) could be consumed by the device itself, or
exported off, if you tethered your phone to a laptop so that the laptop
could use the phone's data connection. There ends up being 3
implementations for this.

Consuming the data on the phone would route it to the IP stack via the
rmnet driver.

Consuming the data on an external device could take one of 2 routes.

Android would use the "native" routing of the Linux IP stack to
essentially NAT the laptop. The data would go to the rmnet driver, to
the IP stack, and the IP stack would route it to USB.

The other route is that the data could be routed directly to USB. This
is "tethered rmnet". In the case of bam_dmux platforms, the USB stack
is a client of bam_dmux.

Tethered rmnet was never an end-user usecase. It was essentially a
validation feature for both internal testing, and also qualifying the
device with the carriers. The carriers knew that Android tethering
involved NAT based routing on the phone, and wanted to figure out if the
phone could meet the raw performance specs of the RF technology (LTE
Category 4 in this case) in a tethered scenario, without the routing.

For tethered rmnet, USB (at the time) was having issues consistently
meeting those data rates (50 Mbps UL, 100 Mbps DL concurrently, if I
recall correctly). So, the decided solution was to implement QMAP
aggregation.

A QMAP "call" over tethered rmnet would be negotiated between the app on
the PC, and "dataservices" or "DS" on the modem. One of the initial
steps of that negotiation causes DS to tell A2 software that QMAP over
tethered rmnet is being activated. That would trigger A2 to activate
the process_dynamic_mtu() code path. Now bam_dmux would allocate future
RX buffers of the increased size which could handle the aggregated
packets. I think the part that is confusing you is, what about the
already queued buffers that are of the old size? Well, essentially
those get consumed by the rest of the QMAP call negotiation, so by the
time actual aggregated data is going to be sent from Modem to bam_dmux,
the pool has been consumed and refilled.
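
The pool turnover described above can be sketched as a small userspace
model. This assumes a simple FIFO pool; struct rx_pool, the rx_pool_*()
helpers and the POOL_SIZE, SMALL_BUF and LARGE_BUF values are made up for
illustration and are not taken from the downstream driver.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4		/* hypothetical number of queued RX buffers */
#define SMALL_BUF 2048		/* hypothetical default buffer size */
#define LARGE_BUF 16384		/* hypothetical size for aggregated packets */

struct rx_pool {
	size_t buf_size[POOL_SIZE];	/* sizes of queued buffers, oldest first */
	size_t next_size;		/* size used for newly queued buffers */
};

static void rx_pool_init(struct rx_pool *pool)
{
	for (int i = 0; i < POOL_SIZE; i++)
		pool->buf_size[i] = SMALL_BUF;
	pool->next_size = SMALL_BUF;
}

/* The modem signalled a new MTU: only buffers queued from now on change size. */
static void rx_pool_set_mtu(struct rx_pool *pool, size_t size)
{
	assert(size > 0);
	pool->next_size = size;
}

/*
 * The modem filled the oldest buffer; queue a replacement at the tail.
 * Returns the size of the buffer that was just consumed.
 */
static size_t rx_pool_consume(struct rx_pool *pool)
{
	size_t used = pool->buf_size[0];

	for (int i = 1; i < POOL_SIZE; i++)
		pool->buf_size[i - 1] = pool->buf_size[i];
	pool->buf_size[POOL_SIZE - 1] = pool->next_size;

	return used;
}

/* Smallest buffer still queued, i.e. the largest packet that always fits. */
static size_t rx_pool_min_size(const struct rx_pool *pool)
{
	size_t min = pool->buf_size[0];

	for (int i = 1; i < POOL_SIZE; i++)
		if (pool->buf_size[i] < min)
			min = pool->buf_size[i];

	return min;
}
```

In this model the smallest queued buffer stays at the old size right after
the MTU change and only reaches the new size once POOL_SIZE packets have
been received. That matches the explanation that the remaining negotiation
traffic drains the old buffers before aggregated data arrives.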

When the tethered rmnet connection is "brought down", DS notifies A2,
and A2 stops requesting the larger buffers.

Since this is not something an end user should ever exercise, you may want
to consider dropping it.

> And a second question if you don't mind: What kind of hardware block
> am I actually talking to here? I say "modem" above but I just know about
> the BAM and the DMUX protocol layer. I have also seen assertion failures
> of the modem DSP firmware if I implement something incorrectly.
>
> Is the DMUX protocol just some firmware concept or actually something
> understood by some hardware block? I've also often seen mentions of some
> "A2" hardware block but I have no idea what that actually is. What's
> even worse, in a really old kernel A2/BAM-DMUX also appears as part of
> the IPA driver [2], and I thought IPA is the new thing after BAM-DMUX...

A2 predates IPA. IPA is essentially an evolution of A2.

Sit down son, let me tell you the history of the world :)

A long time ago, there was only a single processor that did both the
"modem" and the "apps". We generally would call these the 6K days as
that was the number of the chips (6XXX). Then it was decided that the
roles of Apps and Modem should be separated into two different cores.
The modem, handling more "real time" things, and apps, being more
"general purpose". This started with the 7K series.

However, this created a problem as data from a data call may need to be
consumed by the modem, or the apps, and it wouldn't be clear until the
packet headers were inspected, where the packet needed to be routed to.
Sometimes this was handled on apps, sometimes on modem. Usually via a
fully featured IP stack.

With LTE, software couldn't really keep up, and so a hardware engine to
parse the fields and route the packet based on programmed filters was
implemented. This is the "Algorithm Accelerator", aka AA, aka A2.

The A2 first appeared on the 9600 chip, which was originally intended
for Gobi, those dongles you could plug into your laptop to give it a
data connection on the go when there was no wifi. It was then coupled
with both 7x30 and 8660 in what we would call "fusion" to create the
first LTE capable phones (HTC Thunderbolt is the product I recall) until
an integrated solution could come along.

That integrated solution was 8960.

Back to the fusion solution for a second, the 9600 was connected to the
7x30/8660 via SDIO. Prior to this, the data call control and data path
was all in chip via SMD. Each rmnet instance had its own SMD channel,
so essentially its own physical pipe. With SDIO and 9600, there were
not enough lanes, so we invented SDIO_CMUX and SDIO_DMUX - the Control
and Data multiplexers over SDIO.

With 8960, everything was integrated again, so we could run the control
path over SMD and didn't need a mux. However, the A2 moved from the
9600 modem to the 8960 integrated modem, and now we had a direct
connection to its BAM. Again, the BAM had a limited number of physical
pipes, so we needed a data multiplexer again. Thus SDIO_DMUX evolved
into BAM_DMUX.

The A2 is a hardware block with an attached BAM, that "hangs off" the
modem. There is a software component that also runs on the modem, but
in general is limited to configuration. Processing of data is expected
to be all in hardware. As I think I mentioned, the A2 is a hardware
engine that routes IP packets based on programmed filters.

BAM instances (as part of the smart peripheral subsystem or SPS) can
either be out in the system, or attached to a peripheral. The A2 BAM is
attached to the A2 peripheral. BAM instances can run in one of 3 modes:
BAM-to-BAM, BAM-to-System, or System-to-System. BAM-to-BAM is two BAM
instances talking to each other. If the USB controller has a BAM, and
the A2 has a BAM, those two BAMs could talk directly to copy data
between the A2 and USB hardware blocks without software interaction
(after some configuration). "System" means system memory, or DDR.
BAM-to-System is the mode the A2 BAM runs in where it takes data to/from
DDR and gives/takes that data with the A2. System-to-System would be
used by a BAM instance not associated with any peripheral to transfer
data say from Apps DDR to Modem DDR.

The A2 can get data from the RF interface, and determine if that needs
to go to some modem consumer, the apps processor, or on some chips to
the wifi processor. All in hardware, much faster than software for
multiple reasons, but mainly because multiple filters can be evaluated
in parallel, each filter looking at multiple fields in parallel. In a
nutshell, the IPA is a revised A2 that is not associated with any
processor (like the modem), which allows it to route data better (think
wifi and audio usecases).
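
To make the idea of routing by programmed filters more concrete, here is a
software model in C. It is only a sketch under assumptions: the actual A2
filter format is not described in this thread, so struct filter, struct
pkt_fields, the first-match-wins rule and route_packet() are hypothetical.
The loop also evaluates filters one after another, whereas the hardware is
described as evaluating them in parallel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Possible consumers of a packet, following the description above. */
enum dest { DEST_MODEM, DEST_APPS, DEST_WIFI };

/* Header fields extracted from a packet (reduced to two for brevity). */
struct pkt_fields {
	uint8_t proto;		/* IP protocol number */
	uint16_t dst_port;	/* destination port */
};

/* One programmed filter: masked compare on each field, then a destination. */
struct filter {
	uint8_t proto;
	uint8_t proto_mask;	/* 0 means "ignore this field" */
	uint16_t dst_port;
	uint16_t port_mask;	/* 0 means "ignore this field" */
	enum dest dest;
};

/* Return the destination of the first matching filter, else the fallback. */
static enum dest route_packet(const struct filter *filters, size_t n,
			      const struct pkt_fields *pkt, enum dest fallback)
{
	assert(pkt != NULL && (filters != NULL || n == 0));

	for (size_t i = 0; i < n; i++) {
		const struct filter *f = &filters[i];

		if ((pkt->proto & f->proto_mask) == (f->proto & f->proto_mask) &&
		    (pkt->dst_port & f->port_mask) == (f->dst_port & f->port_mask))
			return f->dest;
	}

	return fallback;
}
```

A filter with both masks set to zero matches every packet, so it can serve
as a catch-all entry at the end of the table.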

Hope that all helps. I'm "around" for more questions.

>
> Not sure how much you can reveal about this. :)
>
> Thanks a lot!
> Stephan
>
> [1]: https://source.codeaurora.org/quic/la/kernel/msm-3.10/commit/?h=LA.BR.1.2.9.1-02310-8x16.0&id=c7001b82388129ee02ac9ae1a1ef9993eafbcb26
> [2]: https://source.codeaurora.org/quic/la/kernel/msm/tree/drivers/platform/msm/ipa/a2_service.c?h=LA.BF.1.1.3-01610-8x74.0
>

2021-07-20 02:05:32

by Stephan Gerhold

Subject: Re: [RFC PATCH net-next 0/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

On Mon, Jul 19, 2021 at 09:43:27AM -0600, Jeffrey Hugo wrote:
> On Mon, Jul 19, 2021 at 9:01 AM Stephan Gerhold <[email protected]> wrote:
> >
> > The BAM Data Multiplexer provides access to the network data channels
> > of modems integrated into many older Qualcomm SoCs, e.g. Qualcomm MSM8916
> > or MSM8974. This series adds a driver that allows using it.
> >
> > For more information about BAM-DMUX, see PATCH 4/4.
> >
> > Shortly said, BAM-DMUX is built using a simple protocol layer on top of
> > a DMA engine (Qualcomm BAM DMA). For BAM-DMUX, the BAM DMA engine runs in
> > a quite strange mode that I call "remote power collapse", where the
> > modem/remote side is responsible for powering on the BAM when needed but we
> > are responsible to initialize it. The BAM is power-collapsed when unneeded
> > by coordinating power control via bidirectional interrupts from the
> > BAM-DMUX driver.
>
> The hardware is physically located on the modem, and tied to the modem
> regulators, etc. The modem has the ultimate "off" switch. However,
> due to the BAM architecture (which is complicated), configuration uses
> cooperation on both ends.
>

What I find strange is that it wasn't done similarly to e.g. Slimbus
which has a fairly similar setup. (I used that driver as inspiration for
how to use the mainline qcom_bam driver instead of the "SPS" from
downstream.)

Slimbus uses qcom,controlled-remotely together with the LPASS
remoteproc, so it looks like there LPASS does both power-collapse
and initialization of the BAM. Whereas here the modem does the
power-collapse but we're supposed to do the initialization.

> >
> > The series first adds one possible solution for handling this "remote power
> > collapse" mode in the bam_dma driver, then it adds the BAM-DMUX driver to
> > the WWAN subsystem. Note that the BAM-DMUX driver does not actually make
> > use of the WWAN subsystem yet, since I'm not sure how to fit it in there
> > yet (see PATCH 4/4).
> >
> > Please note that all of the changes in this patch series are based on
> > a fairly complicated driver from Qualcomm [1].
> > I do not have access to any documentation about "BAM-DMUX". :(
>
> I'm pretty sure I still have the internal docs.
>
> Are there specific things you want to know?

Oh, thanks a lot for asking! I mainly mentioned this here to avoid
in-depth questions about the hardware (since I can't answer those).

I can probably think of many, many questions, but I'll try to limit
myself to the two I'm most confused about. :-)

It's somewhat unrelated to this initial patch set since I'm not using
QMAP at the moment, but I'm quite confused about the "MTU negotiation
feature" that you added support for in [1]. (I *think* that is you,
right?) :)

The part that I somewhat understand is the "signal" sent in the "OPEN"
command from the modem. It tells us the maximum buffer size the modem
is willing to accept for TX packets ("ul_mtu" in that commit).

Similarly, if we send "OPEN" to the modem we make the modem aware
of our maximum RX buffer size plus the number of RX buffers.
(create_open_signal() function).

The part that is confusing me is the way the "dynamic MTU" is
enabled/disabled based on the "signal" in "DATA" commands as well.
(process_dynamic_mtu() function). When would that happen? The code
suggests that the modem might just suddenly announce that the large
MTU should be used from now on. But the "buffer_size" is only changed
for newly queued RX buffers so I'm not even sure how the modem knows
that it can now send more data at once.

Any chance you could clarify how this should work exactly?

And a second question if you don't mind: What kind of hardware block
am I actually talking to here? I say "modem" above but I just know about
the BAM and the DMUX protocol layer. I have also seen assertion failures
of the modem DSP firmware if I implement something incorrectly.

Is the DMUX protocol just some firmware concept or actually something
understood by some hardware block? I've also often seen mentions of some
"A2" hardware block but I have no idea what that actually is. What's
even worse, in a really old kernel A2/BAM-DMUX also appears as part of
the IPA driver [2], and I thought IPA is the new thing after BAM-DMUX...

Not sure how much you can reveal about this. :)

Thanks a lot!
Stephan

[1]: https://source.codeaurora.org/quic/la/kernel/msm-3.10/commit/?h=LA.BR.1.2.9.1-02310-8x16.0&id=c7001b82388129ee02ac9ae1a1ef9993eafbcb26
[2]: https://source.codeaurora.org/quic/la/kernel/msm/tree/drivers/platform/msm/ipa/a2_service.c?h=LA.BF.1.1.3-01610-8x74.0

2021-07-20 09:13:52

by Sergey Ryazanov

Subject: Re: [RFC PATCH net-next 4/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

Hello Stephan,

On Mon, Jul 19, 2021 at 6:01 PM Stephan Gerhold <[email protected]> wrote:
> The BAM Data Multiplexer provides access to the network data channels of
> modems integrated into many older Qualcomm SoCs, e.g. Qualcomm MSM8916 or
> MSM8974. It is built using a simple protocol layer on top of a DMA engine
> (Qualcomm BAM) and bidirectional interrupts to coordinate power control.
>
> The modem announces a fixed set of channels by sending an OPEN command.
> The driver exports each channel as separate network interface so that
> a connection can be established via QMI from userspace. The network
> interface can work either in Ethernet or Raw-IP mode (configurable via
> QMI). However, Ethernet mode seems to be broken with most firmwares
> (network packets are actually received as Raw-IP), therefore the driver
> only supports Raw-IP mode.
>
> The driver uses runtime PM to coordinate power control with the modem.
> TX/RX buffers are put in a kind of "ring queue" and submitted via
> the bam_dma driver of the DMAEngine subsystem.
>
> The basic architecture looks roughly like this:
>
> +------------+ +-------+
> [IPv4/6] | BAM-DMUX | | |
> [Data...] | | | |
> ---------->|rmnet0 | [DMUX chan: x] | |
> [IPv4/6] | (chan: 0) | [IPv4/6] | |
> [Data...] | | [Data...] | |
> ---------->|rmnet1 |--------------->| Modem |
> | (chan: 1) | BAM | |
> [IPv4/6] | ... | (DMA Engine) | |
> [Data...] | | | |
> ---------->|rmnet7 | | |
> | (chan: 7) | | |
> +------------+ +-------+
>
> However, on newer SoCs/firmware versions Qualcomm began gradually moving
> to QMAP (rmnet driver) as backend-independent protocol for multiplexing
> and data aggregation. Some firmware versions allow using QMAP on top of
> BAM-DMUX (effectively resulting in a second multiplexing layer plus data
> aggregation). The architecture with QMAP would look roughly like this:
>
> +-------------+ +------------+ +-------+
> [IPv4/6] | RMNET | | BAM-DMUX | | |
> [Data...] | | | | [DMUX chan: 0] | |
> ---------->|rmnet_data1 | ----->|rmnet0 | [QMAP mux-id: x] | |
> | (mux-id: 1) | | | (chan: 0) | [IPv4/6] | |
> | | | | | [Data...] | |
> [IPv4/6] | ... |------ | |----------------->| Modem |
> [Data...] | | | | BAM | |
> ---------->|rmnet_data42 | [QMAP: x] |[rmnet1] | (DMA Engine) | |
> | (mux-id: 42)| [IPv4/6] |... unused! | | |
> | | [Data...] |[rmnet7] | | |
> | | | | | |
> +-------------+ +------------+ +-------+
>
> In this case, rmnet1-7 would remain unused. The firmware used on the most
> recent SoCs with BAM-DMUX even seems to announce only a single BAM-DMUX
> channel (rmnet0), which makes QMAP the only option for multiplexing there.
>
> So far the driver is mainly tested on various smartphones/tablets based on
> Qualcomm MSM8916/MSM8974 without QMAP. It looks like QMAP depends on a MTU
> negotiation feature in BAM-DMUX which is not yet supported by the driver.
>
> Signed-off-by: Stephan Gerhold <[email protected]>
> ---
> Note that this is my first network driver, so I apologize in advance
> if I made some obvious mistakes. :)
>
> I'm not sure how to integrate the driver with the WWAN subsystem yet.
> At the moment the driver creates network interfaces for all channels
> announced by the modem, it does not make use of the WWAN link management
> yet. Unfortunately, this is a bit complicated:
>
> Both QMAP and the built-in multiplexing layer might be needed at some point.
> There are firmware versions that do not support QMAP and the other way around
> (the built-in multiplexing was disabled on very recent firmware versions).
> Only userspace can check if QMAP is supported in the firmware (via QMI).

I am not very familiar with the Qualcomm protocols and am just curious
whether BAM-DMUX has any control (management) channels or only IPv4/v6
data channels?

The WWAN subsystem began as a framework for exporting management
interfaces (MBIM, AT, etc.) to user space. And then the network
interfaces (data channels) management interface was added to
facilitate management of devices with multiple data channels. That is
why I am curious about the BAM-DMUX device management interface or in
other words, how a user space application could control the modem
work?

> I could ignore QMAP completely for now but I think someone will show up
> who will need this eventually. And if there is going to be common code for
> QMAP/rmnet link management it would be nice if BAM-DMUX could also make
> use of it.
>
> But the question is, what could this look like? How do we know if we should
> create a link for QMAP or a BAM-DMUX channel? Does it even make sense
> to manage the 1-8 channels via the WWAN link management?
>
> Another problem is that the WWAN subsystem currently creates all network
> interfaces below the common WWAN device. This means that userspace like
> ModemManager has no way to check which driver provides them. This is
> necessary though to decide how to set it up via QMI (ModemManager uses it).
>
> For reference, example of the channels announced by firmwares on various SoCs:
> - Qualcomm MSM8974: channel 0-7, QMAP not supported
> - Qualcomm MSM8916: channel 0-7, QMAP usually supported, but not always
> (depends on firmware version)
> - Qualcomm MSM8937: channel 0 only, QMAP required for multiplexing(?)
> (Note: This one is theoretical, based on logs; it has not been tested so far...)

--
Sergey

2021-07-21 13:18:53

by Stephan Gerhold

Subject: Re: [RFC PATCH net-next 4/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

Hi Sergey,

On Tue, Jul 20, 2021 at 12:10:42PM +0300, Sergey Ryazanov wrote:
> On Mon, Jul 19, 2021 at 6:01 PM Stephan Gerhold <[email protected]> wrote:
> > The BAM Data Multiplexer provides access to the network data channels of
> > modems integrated into many older Qualcomm SoCs, e.g. Qualcomm MSM8916 or
> > MSM8974. It is built using a simple protocol layer on top of a DMA engine
> > (Qualcomm BAM) and bidirectional interrupts to coordinate power control.
> >
> > The modem announces a fixed set of channels by sending an OPEN command.
> > The driver exports each channel as separate network interface so that
> > a connection can be established via QMI from userspace. The network
> > interface can work either in Ethernet or Raw-IP mode (configurable via
> > QMI). However, Ethernet mode seems to be broken with most firmwares
> > (network packets are actually received as Raw-IP), therefore the driver
> > only supports Raw-IP mode.
> >
> > The driver uses runtime PM to coordinate power control with the modem.
> > TX/RX buffers are put in a kind of "ring queue" and submitted via
> > the bam_dma driver of the DMAEngine subsystem.
> >
> > The basic architecture looks roughly like this:
> >
> > +------------+ +-------+
> > [IPv4/6] | BAM-DMUX | | |
> > [Data...] | | | |
> > ---------->|rmnet0 | [DMUX chan: x] | |
> > [IPv4/6] | (chan: 0) | [IPv4/6] | |
> > [Data...] | | [Data...] | |
> > ---------->|rmnet1 |--------------->| Modem |
> > | (chan: 1) | BAM | |
> > [IPv4/6] | ... | (DMA Engine) | |
> > [Data...] | | | |
> > ---------->|rmnet7 | | |
> > | (chan: 7) | | |
> > +------------+ +-------+
> >
> > However, on newer SoCs/firmware versions Qualcomm began gradually moving
> > to QMAP (rmnet driver) as backend-independent protocol for multiplexing
> > and data aggregation. Some firmware versions allow using QMAP on top of
> > BAM-DMUX (effectively resulting in a second multiplexing layer plus data
> > aggregation). The architecture with QMAP would look roughly like this:
> >
> > +-------------+ +------------+ +-------+
> > [IPv4/6] | RMNET | | BAM-DMUX | | |
> > [Data...] | | | | [DMUX chan: 0] | |
> > ---------->|rmnet_data1 | ----->|rmnet0 | [QMAP mux-id: x] | |
> > | (mux-id: 1) | | | (chan: 0) | [IPv4/6] | |
> > | | | | | [Data...] | |
> > [IPv4/6] | ... |------ | |----------------->| Modem |
> > [Data...] | | | | BAM | |
> > ---------->|rmnet_data42 | [QMAP: x] |[rmnet1] | (DMA Engine) | |
> > | (mux-id: 42)| [IPv4/6] |... unused! | | |
> > | | [Data...] |[rmnet7] | | |
> > | | | | | |
> > +-------------+ +------------+ +-------+
> >
> > In this case, rmnet1-7 would remain unused. The firmware used on the most
> > recent SoCs with BAM-DMUX even seems to announce only a single BAM-DMUX
> > channel (rmnet0), which makes QMAP the only option for multiplexing there.
> >
> > So far the driver is mainly tested on various smartphones/tablets based on
> > Qualcomm MSM8916/MSM8974 without QMAP. It looks like QMAP depends on a MTU
> > negotiation feature in BAM-DMUX which is not yet supported by the driver.
> >
> > Signed-off-by: Stephan Gerhold <[email protected]>
> > ---
> > Note that this is my first network driver, so I apologize in advance
> > if I made some obvious mistakes. :)
> >
> > I'm not sure how to integrate the driver with the WWAN subsystem yet.
> > At the moment the driver creates network interfaces for all channels
> > announced by the modem, it does not make use of the WWAN link management
> > yet. Unfortunately, this is a bit complicated:
> >
> > Both QMAP and the built-in multiplexing layer might be needed at some point.
> > There are firmware versions that do not support QMAP and the other way around
> > (the built-in multiplexing was disabled on very recent firmware versions).
> > Only userspace can check if QMAP is supported in the firmware (via QMI).
>
> I am not very familiar with the Qualcomm protocols and am just curious
> whether BAM-DMUX has any control (management) channels or only IPv4/v6
> data channels?
>
> The WWAN subsystem began as a framework for exporting management
> interfaces (MBIM, AT, etc.) to user space. And then the network
> interfaces (data channels) management interface was added to
> facilitate management of devices with multiple data channels. That is
> why I am curious about the BAM-DMUX device management interface or in
> other words, how a user space application could control the modem
> work?
>

Sorry for the confusion! It's briefly mentioned in the Kconfig option
but I should have made this more clear in the commit message. It was so
long already that I wasn't sure where to put it. :)

BAM-DMUX does not have any control channels. Instead I use it together
with the rpmsg_wwan_ctrl driver [1] that I already submitted for 5.14.
The control/data channels are pretty much separate in this setup and
don't have much to do with each other.

I also had a short overview of some of the many different modem
protocols Qualcomm has come up with in a related RFC for that driver,
see [2] if you are curious.

I hope that clarifies some things, please let me know if I should
explain something better! :)

Thanks!
Stephan

[1]: https://lore.kernel.org/netdev/[email protected]/
[2]: https://lore.kernel.org/netdev/[email protected]/

2021-07-22 14:52:46

by Stephan Gerhold

Subject: Re: [RFC PATCH net-next 0/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

On Mon, Jul 19, 2021 at 05:13:32PM -0600, Jeffrey Hugo wrote:
> On 7/19/2021 12:23 PM, Stephan Gerhold wrote:
> > On Mon, Jul 19, 2021 at 09:43:27AM -0600, Jeffrey Hugo wrote:
> > > On Mon, Jul 19, 2021 at 9:01 AM Stephan Gerhold <[email protected]> wrote:
> > > >
> > > > The BAM Data Multiplexer provides access to the network data channels
> > > > of modems integrated into many older Qualcomm SoCs, e.g. Qualcomm MSM8916
> > > > or MSM8974. This series adds a driver that allows using it.
> > > >
> > > > For more information about BAM-DMUX, see PATCH 4/4.
> > > >
> > > > Shortly said, BAM-DMUX is built using a simple protocol layer on top of
> > > > a DMA engine (Qualcomm BAM DMA). For BAM-DMUX, the BAM DMA engine runs in
> > > > a quite strange mode that I call "remote power collapse", where the
> > > > modem/remote side is responsible for powering on the BAM when needed but we
> > > > are responsible to initialize it. The BAM is power-collapsed when unneeded
> > > > by coordinating power control via bidirectional interrupts from the
> > > > BAM-DMUX driver.
> > >
> > > The hardware is physically located on the modem, and tied to the modem
> > > regulators, etc. The modem has the ultimate "off" switch. However,
> > > due to the BAM architecture (which is complicated), configuration uses
> > > cooperation on both ends.
> > >
> >
> > What I find strange is that it wasn't done similarly to e.g. Slimbus
> > which has a fairly similar setup. (I used that driver as inspiration for
> > how to use the mainline qcom_bam driver instead of the "SPS" from
> > downstream.)
> >
> > Slimbus uses qcom,controlled-remotely together with the LPASS
> > remoteproc, so it looks like there LPASS does both power-collapse
> > and initialization of the BAM. Whereas here the modem does the
> > power-collapse but we're supposed to do the initialization.
>
> I suspect I don't have a satisfactory answer for you. The teams that did
> slimbus were not the teams involved in the bam_dmux, and the two didn't talk
> to each other. The bam_dmux side wasn't aware of the slimbus situation, at
> the time. I don't know if the slimbus folks knew about bam_dmux. If you
> have two silos working independently, it's unlikely they will create exactly
> the same solution.
>

Fair enough :)

> >
> > It's somewhat unrelated to this initial patch set since I'm not using
> > QMAP at the moment, but I'm quite confused about the "MTU negotiation
> > feature" that you added support for in [1]. (I *think* that is you,
> > right?) :)
>
> Yes. Do I owe you for some brain damage? :)
>

A bit to be absolutely honest. :D
But I was able to ignore this feature so far so it was not much of
a problem. ;)

> >
> > The part that I somewhat understand is the "signal" sent in the "OPEN"
> > command from the modem. It tells us the maximum buffer size the modem
> > is willing to accept for TX packets ("ul_mtu" in that commit).
> >
> > Similarly, if we send "OPEN" to the modem we make the modem aware
> > of our maximum RX buffer size plus the number of RX buffers.
> > (create_open_signal() function).
> >
> > The part that is confusing me is the way the "dynamic MTU" is
> > enabled/disabled based on the "signal" in "DATA" commands as well.
> > (process_dynamic_mtu() function). When would that happen? The code
> > suggests that the modem might just suddenly announce that the large
> > MTU should be used from now on. But the "buffer_size" is only changed
> > for newly queued RX buffers so I'm not even sure how the modem knows
> > that it can now send more data at once.
> >
> > Any chance you could clarify how this should work exactly?
>
> So, I think some of this might make more sense after my response to question
> #2.
>

Indeed, I was worried that you wouldn't be able to answer the second
one, otherwise I would probably have asked it first. I'll reorder the
mail because it's clearer:

> > And a second question if you don't mind: What kind of hardware block
> > am I actually talking to here? I say "modem" above but I just know about
> > the BAM and the DMUX protocol layer. I have also seen assertion failures
> > of the modem DSP firmware if I implement something incorrectly.
> >
> > Is the DMUX protocol just some firmware concept or actually something
> > understood by some hardware block? I've also often seen mentions of some
> > "A2" hardware block but I have no idea what that actually is. What's
> > even worse, in a really old kernel A2/BAM-DMUX also appears as part of
> > the IPA driver [2], and I thought IPA is the new thing after BAM-DMUX...
>
> A2 predates IPA. IPA is essentially an evolution of A2.
>
> Sit down son, let me tell you the history of the world :)
>
> A long time ago, there was only a single processor that did both the "modem"
> and the "apps". We generally would call these the 6K days as that was the
> number of the chips (6XXX). Then it was decided that the roles of Apps and
> Modem should be separated into two different cores. The modem, handling more
> "real time" things, and apps, being more "general purpose". This started
> with the 7K series.
>
> However, this created a problem as data from a data call may need to be
> consumed by the modem, or the apps, and it wouldn't be clear until the
> packet headers were inspected, where the packet needed to be routed to.
> Sometimes this was handled on apps, sometimes on modem. Usually via a fully
> featured IP stack.
>
> With LTE, software couldn't really keep up, and so a hardware engine to
> parse the fields and route the packet based on programmed filters was
> implemented. This is the "Algorithm Accelerator", aka AA, aka A2.
>
> The A2 first appeared on the 9600 chip, which was originally intended for
> Gobi- those dongles you could plug into your laptop to give it a data
> connection on the go when there was no wifi. It was then coupled with both
> 7x30 and 8660 in what we would call "fusion" to create the first LTE capable
> phones (HTC thunderbolt is the product I recall) until an integrated
> solution could come along.
>
> That integrated solution was 8960.
>
> Back to the fusion solution for a second, the 9600 was connected to the
> 7x30/8660 via SDIO. Prior to this, the data call control and data path was
> all in chip via SMD. Each rmnet instance had its own SMD channel, so
> essentially its own physical pipe. With SDIO and 9600, there were not
> enough lanes, so we invented SDIO_CMUX and SDIO_DMUX - the Control and Data
> multiplexers over SDIO.
>
> With 8960, everything was integrated again, so we could run the control path
> over SMD and didn't need a mux. However, the A2 moved from the 9600 modem
> to the 8960 integrated modem, and now we had a direct connection to its BAM.
> Again, the BAM had a limited number of physical pipes, so we needed a data
> multiplexer again. Thus SDIO_DMUX evolved into BAM_DMUX.
>
> The A2 is a hardware block with an attached BAM, that "hangs off" the modem.
> There is a software component that also runs on the modem, but in general is
> limited to configuration. Processing of data is expected to be all in
> hardware. As I think I mentioned, the A2 is a hardware engine that routes
> IP packets based on programmed filters.
>
> BAM instances (as part of the smart peripheral subsystem or SPS) can either
> be out in the system, or attached to a peripheral. The A2 BAM is attached
> to the A2 peripheral. BAM instances can run in one of 3 modes - BAM-to-BAM,
> BAM-to-System, or System-to-System. BAM-to-BAM is two BAM instances talking
> to each other. If the USB controller has a BAM, and the A2 has a BAM, those
> two BAMS could talk directly to copy data between the A2 and USB hardware
> blocks without software interaction (after some configuration). "System"
> means system memory, or DDR. Bam-to-System is the mode the A2 BAM runs in
> where it takes data to/from DDR and gives/takes that data with the A2.
> System-to-System would be used by a BAM instance not associated with any
> peripheral to transfer data say from Apps DDR to Modem DDR.
>
> The A2 can get data from the RF interface, and determine if that needs to go
> to some modem consumer, the apps processor, or on some chips to the wifi
> processor. All in hardware, much faster than software for multiple reasons,
> but mainly because multiple filters can be evaluated in parallel, each
> filter looking at multiple fields in parallel. In a nutshell, the IPA is a
> revised A2 that is not associated with any processor (like the modem), which
> allows it to route data better (think wifi and audio usecases).
>
> Hope that all helps. I'm "around" for more questions.
>

Wow, I can't thank you enough for all the detailed explanations!
I've seen many small hints of this in various places but I could never
really understand how they all relate to each other.
This is much clearer now. :)

> I don't know how much of this translates to modern platforms. I don't
> really work on MSMs anymore, but I can convey what I recall and how things
> were "back then"
>
> So, essentially the change you are looking at is the bam_dmux portion of an
> overall feature for improving the performance of what was known as "tethered
> rmnet".
>
> Per my understanding (which the documentation of this feature reinforces),
> tethered rmnet was chiefly a test feature. Your "data" (websites, email,
> etc) could be consumed by the device itself, or exported off, if you
> tethered your phone to a laptop so that the laptop could use the phone's
> data connection. There ends up being 3 implementations for this.
>
> Consuming the data on the phone would route it to the IP stack via the rmnet
> driver.
>
> Consuming the data on an external device could take one of 2 routes.
>
> Android would use the "native" routing of the Linux IP stack to essentially
> NAT the laptop. The data would go to the rmnet driver, to the IP stack, and
> the IP stack would route it to USB.
>
> The other route is that the data could be routed directly to USB. This is
> "tethered rmnet". In the case of bam_dmux platforms, the USB stack is a
> client of bam_dmux.
>
> Tethered rmnet was never an end-user usecase.
>

I'm pretty sure it's actively used now on typical USB modems based on
MDM9607. As far as I know that one has BAM-DMUX and "forwards" it via
USB (without NAT).

> It was essentially a validation feature for both internal testing, and
> also qualifying the device with the carriers. The carriers knew that
> Android tethering involved NAT based routing on the phone, and wanted
> to figure out if the phone could meet the raw performance specs of the
> RF technology (LTE Category 4 in this case) in a tethered scenario,
> without the routing.
>
> For tethered rmnet, USB (at the time) was having issues consistently meeting
> those data rates (50mbps UL, 100mbps DL concurrently, if I recall
> correctly). So, the decided solution was to implement QMAP aggregation.
>
> A QMAP "call" over tethered rmnet would be negotiated between the app on the
> PC, and "dataservices" or "DS" on the modem. One of the initial steps of
> that negotiation causes DS to tell A2 software that QMAP over tethered
> rmnet is being activated. That would trigger A2 to activate the
> process_dynamic_mtu() code path. Now bam_dmux would allocate future RX
> buffers of the increased size which could handle the aggregated packets. I
> think the part that is confusing you is, what about the already queued
> buffers that are of the old size? Well, essentially those get consumed by
> the rest of the QMAP call negotiation, so by the time actual aggregated data
> is going to be sent from Modem to bam_dmux, the pool has been consumed and
> refilled.
>
> When the tethered rmnet connection is "brought down", DS notifies A2, and A2
> stops requesting the larger buffers.
>

Hmm, is this "DS" on the modem something special I don't know about?
It sounds like the part of the modem that I talk to via QMI to establish
new connections. However, since QMI does not go through BAM-DMUX
(RPMSG/SMD or QRTR instead) there should be only very few packets sent
via BAM-DMUX during negotiation of QMAP.
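
To make the concern concrete, here is a toy model of the RX buffer pool
(not the driver code). The pool depth of 32 and the buffer sizes are made
up for illustration, and all rx_pool_* names are hypothetical. It only
shows that, when a new size applies to newly queued buffers only, the
modem has to consume every old buffer before all queued buffers are large:

```c
#include <stddef.h>

#define RX_POOL_BUFS 32 /* hypothetical pool depth */

struct rx_pool {
	size_t buf_size[RX_POOL_BUFS]; /* size of each currently queued buffer */
	size_t next;                   /* index of the buffer filled next */
	size_t new_size;               /* size used when re-queuing a buffer */
};

void rx_pool_init(struct rx_pool *p, size_t size)
{
	for (size_t i = 0; i < RX_POOL_BUFS; i++)
		p->buf_size[i] = size;
	p->next = 0;
	p->new_size = size;
}

/* "Dynamic MTU" signal: affects only buffers queued from now on. */
void rx_pool_set_mtu(struct rx_pool *p, size_t size)
{
	p->new_size = size;
}

/*
 * One RX completion: returns the size of the buffer that was just filled
 * and re-queues that slot with the currently requested size.
 */
size_t rx_pool_complete(struct rx_pool *p)
{
	size_t used = p->buf_size[p->next];

	p->buf_size[p->next] = p->new_size;
	p->next = (p->next + 1) % RX_POOL_BUFS;
	return used;
}

/* Number of queued buffers that are still smaller than "size". */
size_t rx_pool_count_smaller(const struct rx_pool *p, size_t size)
{
	size_t n = 0;

	for (size_t i = 0; i < RX_POOL_BUFS; i++)
		if (p->buf_size[i] < size)
			n++;
	return n;
}
```

In this model the first large buffer is only handed out after as many
completions as the pool is deep, which is why only a few packets during
the negotiation would not be enough to turn the pool over.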

To be sure I just tried QMAP with my BAM-DMUX driver again. It's been
quite some time since I tried it and it turns out this causes even more
"brain damage" than I could even remember. :D For reference:

1. First I need to set the modem to QMAP mode, this works e.g. with
qmicli -pd /dev/wwan0qmi0 \
--wda-set-data-format="link-layer-protocol=raw-ip,ul-protocol=qmap,dl-protocol=qmap,dl-datagram-max-size=4096"

However, it's important that my BAM-DMUX driver OPENs the channel
before doing this (together with announcing support for the "dynamic
MTU" feature). Otherwise the modem hangs forever and stops responding
to any QMI messages. This doesn't happen when switching to Raw-IP mode.

2. With QMAP, the struct bam_dmux_hdr->len is always set to 0xffff (65535)
instead of the actual packet length, which means my current driver
just drops those packets ("Data larger than buffer? (65535 > 4088)").

This is also handled in your commit (you get the size from the SPS
driver instead), but the bam_dma driver in mainline currently does
not have this feature. :/

3. I sent some ping packets but never got the signal to "enable large
MTU" from the modem. Something is still strange here. :/
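
For point 2, a minimal sketch of the length check in plain C. The header
layout is an assumption based on the downstream driver, and the 4096-byte
buffer is chosen so that the limit matches the 4088 from the error message
above. The dma_len parameter stands for a hypothetical "bytes actually
transferred" value from the DMA engine; passing 0 models the situation
where that information is not available:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header layout, taken from my reading of the downstream driver. */
struct bam_dmux_hdr {
	uint16_t magic;
	uint8_t signal;
	uint8_t cmd;
	uint8_t pad;
	uint8_t ch;
	uint16_t len;
};

static_assert(sizeof(struct bam_dmux_hdr) == 8, "header is expected to be 8 bytes");

#define BAM_DMUX_BUFFER_SIZE 4096u /* assumed RX buffer size */
#define BAM_DMUX_MAX_DATA (BAM_DMUX_BUFFER_SIZE - sizeof(struct bam_dmux_hdr))
#define BAM_DMUX_LEN_UNKNOWN 0xffffu /* placeholder seen in QMAP mode */

/*
 * Returns the payload length to use for a received buffer, or -1 if the
 * packet has to be dropped. dma_len is the number of bytes the DMA engine
 * reports as transferred (header included), or 0 if it cannot tell.
 */
long bam_dmux_rx_payload_len(const struct bam_dmux_hdr *hdr, size_t dma_len)
{
	size_t len = hdr->len;

	if (len == BAM_DMUX_LEN_UNKNOWN) {
		/* The header length is useless here, fall back to the DMA size. */
		if (dma_len <= sizeof(*hdr))
			return -1;
		len = dma_len - sizeof(*hdr);
	}

	if (len > BAM_DMUX_MAX_DATA)
		return -1; /* "Data larger than buffer?" */

	return (long)len;
}
```

Without a transferred size (dma_len == 0) every packet with len == 0xffff
is dropped, which is the behaviour I see with QMAP right now.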

Given all these complications (that are not present when ignoring QMAP)
I would generally agree with you that it's not worth supporting this:

> Since this not something an end user should ever exercise, you may want to
> consider dropping it.
>

Personally, I have indeed no need for it. I just suspect someone might
want this eventually for one of the following two use cases:

1. Multiplexing on new firmwares: AFAICT there is only one BAM-DMUX
channel on recent firmware versions (e.g. MSM8937/MDM9607). In that
case multiple connections are only possible through the multiplexing
layer in QMAP. I've been told the multiplexing is actually useful and
necessary in some cases (maybe it was for some MMS configurations,
I don't remember exactly).

2. USB tethering: I know some people are working on mainline Linux
for some MDM9607-based USB modems and they will probably want the
weird USB tethering feature at some point.

But all in all given all the trouble involved when making QMAP work
I think I will just ignore that feature for now and wait until someone
shows up who absolutely needs this feature...

Thanks again for all the explanations!
Stephan

2021-07-22 15:43:03

by Stephan Gerhold

Subject: Re: [RFC PATCH net-next 4/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

Hi Loic,

On Mon, Jul 19, 2021 at 06:01:33PM +0200, Loic Poulain wrote:
> On Mon, 19 Jul 2021 at 17:01, Stephan Gerhold <[email protected]> wrote:
> >
> > I'm not sure how to integrate the driver with the WWAN subsystem yet.
> > At the moment the driver creates network interfaces for all channels
> > announced by the modem, it does not make use of the WWAN link management
> > yet. Unfortunately, this is a bit complicated:
> >
> > Both QMAP and the built-in multiplexing layer might be needed at some point.
> > There are firmware versions that do not support QMAP and the other way around
> > (the built-in multiplexing was disabled on very recent firmware versions).
> > Only userspace can check if QMAP is supported in the firmware (via QMI).
> >
> > I could ignore QMAP completely for now but I think someone will show up
> > who will need this eventually. And if there is going to be common code for
> > QMAP/rmnet link management it would be nice if BAM-DMUX could also make
> > use of it.
>
> I have this on my TODO list for mhi-net QMAP.
>

Great, thanks!

> > But the question is, what could this look like? How do we know if we should
> > create a link for QMAP or a BAM-DMUX channel? Does it even make sense
> > to manage the 1-8 channels via the WWAN link management?
>
> Couldn't it be specified via dts (property or different compatible
> string)?

It would probably work in most cases, but I have to admit that I would
prefer to avoid this for the following reason: This driver is used on
some smartphones that have different variants for different parts of the
world. As far as Linux is concerned the hardware is pretty much
identical, but the modem firmware is often somewhat device-specific.

This means that the same device tree is often used with different
firmware versions. Perhaps we are lucky enough that the firmware
versions have the same capabilities, but I'm not fully sure about that.

I think at the end the situation is fairly similar to qmi_wwan/USB.
There the kernel also does not know if the modem supports QMAP or not.
The way it's solved there at the moment is that ModemManager tries to
enable it from user space and then the mode of the network interface
can be switched through a sysfs file ("qmi/pass_through").

Something like this should probably also work in my case. This should
also allow me to ignore QMAP for now and deal with it if someone really
needs it at some point since it's quite complicated for BAM-DMUX.
(I tried QMAP again today and listed the problems in [1] for reference,
but it's all BAM-DMUX specific...)

[1] https://lore.kernel.org/netdev/[email protected]/

>
> would it make sense to have two drivers (with common core) to
> manage either the multi-bam channel or newer QMAP based single
> bam-channel modems.
>

There should be fairly little difference between those two usage modes,
so I don't think it's worth splitting the driver for this. Actually
right now (ignoring the link management of the WWAN subsystem),
it's already possible to use both.

I can use the network interfaces as-is in Raw-IP mode or I do
"sudo ip link add link rmnet0 name rmnet0_qmap type rmnet mux_id 1"
on top and use QMAP. The BAM-DMUX driver does not care, because it
just hands over sent/received packets as-is and the modem data format
must be always configured via QMI from user space.

> >
> > Another problem is that the WWAN subsystem currently creates all network
> > interfaces below the common WWAN device. This means that userspace like
> > ModemManager has no way to check which driver provides them. This is
> > necessary though to decide how to set it up via QMI (ModemManager uses it).
>
> Well, I have quite a similar concern since I'm currently porting
> mhi-net mbim to wwan framework, and I was thinking about not making
> wwan device parent of the network link/netdev (in the same way as
> wlan0 is not child of ieee80211 device), but not sure if it's a good
> idea or not since we can not really consider driver name part of the
> uapi.
>

Hm, I think the main disadvantage of that would be that the network
interface is no longer directly related to the WWAN device, right?
Userspace would then need some special matching to find the network
interfaces that belong to a certain control port.

With the current setup, e.g. ModemManager can simply match the WWAN
device and then look at its children and find the control port and
network interfaces. How would it find the network interfaces if they are
no longer below the WWAN device?

> The way links are created is normally abstracted, so if you know which
> bam variant you have from wwan network driver side (e.g. via dts), you
> should have nothing to check on the user side, except the session id.

In a perfect world it would probably be like this, but I'm afraid the
Qualcomm firmware situation isn't as simple. User space needs to know
which setup it is dealing with because all the setup happens via QMI.

Let's take the BAM-DMUX channels vs QMAP mux-IDs for example:

First, user space needs to configure the data format. This happens with
the QMI WDA (Wireless Data Administrative Service) "Set Data Format"
message. Parameter would be link layer format (Raw-IP in both cases)
but also the uplink/downlink data aggregation protocol. This is either
one of many QMAP versions (qmap|qmapv2|qmapv3|qmapv4|qmapv5), or simply
"none" when using BAM-DMUX without QMAP.

Then, the "session ID" (= BAM-DMUX channel or QMAP mux-ID) must be bound
to a WDS (Wireless Data Service) session. The QMI message for that is
different for BAM-DMUX and QMAP:

- BAM-DMUX: WDS "Bind Data Port"
(Parameter: SIO port number, can be derived from channel ID)

- QMAP: WDS "Bind MUX Data Port" (note the "MUX", different message!)
(Parameter: MUX ID, port type (USB/embedded/...), port number)

My point here: Since userspace is responsible for QMI at the moment
we will definitely need to make it aware of the setup that it needs to
apply. Just having an abstract "session ID" won't be enough to set up
the connection properly. :/
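
The difference can be summarized in a small C sketch. All type and
function names here are made up for illustration (this is not an existing
kernel or libqmi API); it only encodes the two cases described above:

```c
#include <stdbool.h>

/* Aggregation protocol passed to WDA "Set Data Format" (simplified). */
enum wda_aggregation {
	WDA_AGG_NONE, /* plain BAM-DMUX channels */
	WDA_AGG_QMAP, /* one of the QMAP versions */
};

/* WDS message needed to bind the session to a data port. */
enum wds_bind_msg {
	WDS_BIND_DATA_PORT,     /* parameter: SIO port number */
	WDS_BIND_MUX_DATA_PORT, /* parameters: MUX ID, port type, port number */
};

struct qmi_session_plan {
	enum wda_aggregation aggregation;
	enum wds_bind_msg bind_msg;
	unsigned int session_id; /* BAM-DMUX channel or QMAP mux-ID */
};

/*
 * Decide which QMI messages user space has to send for one session.
 * The link layer format is Raw-IP in both cases, so it is not modelled.
 */
struct qmi_session_plan qmi_plan_session(bool use_qmap, unsigned int session_id)
{
	struct qmi_session_plan plan = {
		.aggregation = use_qmap ? WDA_AGG_QMAP : WDA_AGG_NONE,
		.bind_msg = use_qmap ? WDS_BIND_MUX_DATA_PORT : WDS_BIND_DATA_PORT,
		.session_id = session_id,
	};

	return plan;
}
```

The same numeric session ID leads to different messages depending on
use_qmap, and that flag is exactly the piece of information user space
cannot derive from an abstract session ID alone.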

Thanks!
Stephan

2021-07-24 10:24:57

by Sergey Ryazanov

Subject: Re: [RFC PATCH net-next 4/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

Hello Stephan,

On Wed, Jul 21, 2021 at 3:17 PM Stephan Gerhold <[email protected]> wrote:
> On Tue, Jul 20, 2021 at 12:10:42PM +0300, Sergey Ryazanov wrote:
>> On Mon, Jul 19, 2021 at 6:01 PM Stephan Gerhold <[email protected]> wrote:
>>> The BAM Data Multiplexer provides access to the network data channels of
>>> modems integrated into many older Qualcomm SoCs, e.g. Qualcomm MSM8916 or
>>> MSM8974. It is built using a simple protocol layer on top of a DMA engine
>>> (Qualcomm BAM) and bidirectional interrupts to coordinate power control.
>>>
>>> The modem announces a fixed set of channels by sending an OPEN command.
>>> The driver exports each channel as separate network interface so that
>>> a connection can be established via QMI from userspace. The network
>>> interface can work either in Ethernet or Raw-IP mode (configurable via
>>> QMI). However, Ethernet mode seems to be broken with most firmwares
>>> (network packets are actually received as Raw-IP), therefore the driver
>>> only supports Raw-IP mode.
>>>
>>> The driver uses runtime PM to coordinate power control with the modem.
>>> TX/RX buffers are put in a kind of "ring queue" and submitted via
>>> the bam_dma driver of the DMAEngine subsystem.
>>>
>>> The basic architecture looks roughly like this:
>>>
>>> +------------+ +-------+
>>> [IPv4/6] | BAM-DMUX | | |
>>> [Data...] | | | |
>>> ---------->|rmnet0 | [DMUX chan: x] | |
>>> [IPv4/6] | (chan: 0) | [IPv4/6] | |
>>> [Data...] | | [Data...] | |
>>> ---------->|rmnet1 |--------------->| Modem |
>>> | (chan: 1) | BAM | |
>>> [IPv4/6] | ... | (DMA Engine) | |
>>> [Data...] | | | |
>>> ---------->|rmnet7 | | |
>>> | (chan: 7) | | |
>>> +------------+ +-------+
>>>
>>> However, on newer SoCs/firmware versions Qualcomm began gradually moving
>>> to QMAP (rmnet driver) as backend-independent protocol for multiplexing
>>> and data aggregation. Some firmware versions allow using QMAP on top of
>>> BAM-DMUX (effectively resulting in a second multiplexing layer plus data
>>> aggregation). The architecture with QMAP would look roughly like this:
>>>
>>> +-------------+ +------------+ +-------+
>>> [IPv4/6] | RMNET | | BAM-DMUX | | |
>>> [Data...] | | | | [DMUX chan: 0] | |
>>> ---------->|rmnet_data1 | ----->|rmnet0 | [QMAP mux-id: x] | |
>>> | (mux-id: 1) | | | (chan: 0) | [IPv4/6] | |
>>> | | | | | [Data...] | |
>>> [IPv4/6] | ... |------ | |----------------->| Modem |
>>> [Data...] | | | | BAM | |
>>> ---------->|rmnet_data42 | [QMAP: x] |[rmnet1] | (DMA Engine) | |
>>> | (mux-id: 42)| [IPv4/6] |... unused! | | |
>>> | | [Data...] |[rmnet7] | | |
>>> | | | | | |
>>> +-------------+ +------------+ +-------+
>>>
>>> In this case, rmnet1-7 would remain unused. The firmware used on the most
>>> recent SoCs with BAM-DMUX even seems to announce only a single BAM-DMUX
>>> channel (rmnet0), which makes QMAP the only option for multiplexing there.
>>>
>>> So far the driver is mainly tested on various smartphones/tablets based on
>>> Qualcomm MSM8916/MSM8974 without QMAP. It looks like QMAP depends on a MTU
>>> negotiation feature in BAM-DMUX which is not yet supported by the driver.
>>>
>>> Signed-off-by: Stephan Gerhold <[email protected]>
>>> ---
>>> Note that this is my first network driver, so I apologize in advance
>>> if I made some obvious mistakes. :)
>>>
>>> I'm not sure how to integrate the driver with the WWAN subsystem yet.
>>> At the moment the driver creates network interfaces for all channels
>>> announced by the modem, it does not make use of the WWAN link management
>>> yet. Unfortunately, this is a bit complicated:
>>>
>>> Both QMAP and the built-in multiplexing layer might be needed at some point.
>>> There are firmware versions that do not support QMAP and the other way around
>>> (the built-in multiplexing was disabled on very recent firmware versions).
>>> Only userspace can check if QMAP is supported in the firmware (via QMI).
>>
>> I am not very familiar with the Qualcomm protocols and am just curious
>> whether BAM-DMUX has any control (management) channels or only IPv4/v6
>> data channels?
>>
>> The WWAN subsystem began as a framework for exporting management
>> interfaces (MBIM, AT, etc.) to user space. And then the network
>> interfaces (data channels) management interface was added to
>> facilitate management of devices with multiple data channels. That is
>> why I am curious about the BAM-DMUX device management interface or in
>> other words, how a user space application could control the modem
>> work?
>
> Sorry for the confusion! It's briefly mentioned in the Kconfig option
> but I should have made this more clear in the commit message. It was so
> long already that I wasn't sure where to put it. :)
>
> BAM-DMUX does not have any control channels. Instead I use it together
> with the rpmsg_wwan_ctrl driver [1] that I already submitted for 5.14.
> The control/data channels are pretty much separate in this setup and
> don't have much to do with each other.
>
> I also had a short overview of some of the many different modem
> protocols Qualcomm has come up with in a related RFC for that driver,
> see [2] if you are curious.
>
> I hope that clarifies some things, please let me know if I should
> explain something better! :)
>
> [1]: https://lore.kernel.org/netdev/[email protected]/
> [2]: https://lore.kernel.org/netdev/[email protected]/

Many thanks for such informative clarification, especially for
pointing me to the rpmsg_wwan_ctrl driver. I saw it, but for some
reason I did not link it to BAM-DMUX. Reading these links in
conjunction with your parallel talks makes the situation much
clearer. I could not say that "I know kung fu", but I can say that now I
know how complex kung fu is.

--
Sergey
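
The two layouts in the quoted diagrams (Raw-IP directly on a BAM-DMUX
channel versus QMAP stacked on channel 0) can be sketched as nested frames.
This is only a model of the layering, assuming nothing about the real wire
format: the class and function names are hypothetical, and only the
interface names (rmnet0..rmnet7, rmnet_data<mux-id>) come from the diagrams.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QmapFrame:
    """Hypothetical model of a QMAP frame: a mux-id plus one IP packet."""
    mux_id: int
    ip_packet: bytes


@dataclass(frozen=True)
class DmuxFrame:
    """Hypothetical model of a BAM-DMUX frame: a channel plus its payload.

    The payload is raw IP bytes, or a QmapFrame when QMAP is layered on top.
    """
    channel: int
    payload: object


def send_raw_ip(channel, ip_packet):
    # Without QMAP, the BAM-DMUX channel itself selects the session.
    return DmuxFrame(channel, ip_packet)


def send_qmap(mux_id, ip_packet):
    # With QMAP, everything goes through channel 0 and the QMAP mux-id
    # selects the session (second multiplexing layer).
    return DmuxFrame(0, QmapFrame(mux_id, ip_packet))


def receive(frame):
    """Return (interface name, IP packet) for a received frame."""
    if isinstance(frame.payload, QmapFrame):
        return f"rmnet_data{frame.payload.mux_id}", frame.payload.ip_packet
    return f"rmnet{frame.channel}", frame.payload
```

In this model the BAM-DMUX layer never looks inside the payload, which
matches the statement in the thread that the driver hands packets over
as-is and leaves the data format to be configured from user space.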

2021-07-24 11:27:37

by Sergey Ryazanov

Subject: Re: [RFC PATCH net-next 4/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

On Thu, Jul 22, 2021 at 6:40 PM Stephan Gerhold <[email protected]> wrote:
> On Mon, Jul 19, 2021 at 06:01:33PM +0200, Loic Poulain wrote:
>> On Mon, 19 Jul 2021 at 17:01, Stephan Gerhold <[email protected]> wrote:
>>> I'm not sure how to integrate the driver with the WWAN subsystem yet.
>>> At the moment the driver creates network interfaces for all channels
>>> announced by the modem, it does not make use of the WWAN link management
>>> yet. Unfortunately, this is a bit complicated:
>>>
>>> Both QMAP and the built-in multiplexing layer might be needed at some point.
>>> There are firmware versions that do not support QMAP and the other way around
>>> (the built-in multiplexing was disabled on very recent firmware versions).
>>> Only userspace can check if QMAP is supported in the firmware (via QMI).
>>>
>>> I could ignore QMAP completely for now but I think someone will show up
>>> who will need this eventually. And if there is going to be common code for
>>> QMAP/rmnet link management it would be nice if BAM-DMUX could also make
>>> use of it.
>>
>> I have this on my TODO list for mhi-net QMAP.
>
> Great, thanks!
>
>>> But the question is, how could this look like? How do we know if we should
>>> create a link for QMAP or a BAM-DMUX channel? Does it even make sense
>>> to manage the 1-8 channels via the WWAN link management?
>>
>> Couldn't it be specified via dts (property or different compatible
>> string)?
>
> It would probably work in most cases, but I have to admit that I would
> prefer to avoid this for the following reason: This driver is used on
> some smartphones that have different variants for different parts of the
> world. As far as Linux is concerned the hardware is pretty much
> identical, but the modem firmware is often somewhat device-specific.
>
> This means that the same device tree is often used with different
> firmware versions. Perhaps we are lucky enough that the firmware
> versions have the same capabilities, but I'm not fully sure about that.
>
> I think at the end the situation is fairly similar to qmi_wwan/USB.
> There the kernel also does not know if the modem supports QMAP or not.
> The way it's solved there at the moment is that ModemManager tries to
> enable it from user space and then the mode of the network interface
> can be switched through a sysfs file ("qmi/pass_through").
>
> Something like this should probably also work in my case. This should
> also allow me to ignore QMAP for now and deal with it if someone really
> needs it at some point since it's quite complicated for BAM-DMUX.
> (I tried QMAP again today and listed the problems in [1] for reference,
> but it's all BAM-DMUX specific...)
>
> [1] https://lore.kernel.org/netdev/[email protected]/
>
>> would it make sense to have two drivers (with common core) to
>> manage either the multi-bam channel or newer QMAP based single
>> bam-channel modems.
>
> There should be fairly little difference between those two usage modes,
> so I don't think it's worth splitting the driver for this. Actually
> right now (ignoring the link management of the WWAN subsystem),
> it's already possible to use both.
>
> I can use the network interfaces as-is in Raw-IP mode or I do
> "sudo ip link add link rmnet0 name rmnet0_qmap type rmnet mux_id 1"
> on top and use QMAP. The BAM-DMUX driver does not care, because it
> just hands over sent/received packets as-is and the modem data format
> must be always configured via QMI from user space.
>
>>> Another problem is that the WWAN subsystem currently creates all network
>>> interfaces below the common WWAN device. This means that userspace like
>>> ModemManager has no way to check which driver provides them. This is
>>> necessary though to decide how to set it up via QMI (ModemManager uses it).
>>
>> Well, I have quite a similar concern since I'm currently porting
>> mhi-net mbim to wwan framework, and I was thinking about not making
>> wwan device parent of the network link/netdev (in the same way as
>> wlan0 is not child of ieee80211 device), but not sure if it's a good
>> idea or not since we can not really consider driver name part of the
>> uapi.
>
> Hm, I think the main disadvantage of that would be that the network
> interface is no longer directly related to the WWAN device, right?
> Userspace would then need some special matching to find the network
> interfaces that belong to a certain control port.
>
> With the current setup, e.g. ModemManager can simply match the WWAN
> device and then look at its children and find the control port and
> network interfaces. How would it find the network interfaces if they are
> no longer below the WWAN device?
>
>> The way links are created is normally abstracted, so if you know which
>> bam variant you have from wwan network driver side (e.g. via dts), you
>> should have nothing to check on the user side, except the session id.
>
> In a perfect world it would probably be like this, but I'm afraid the
> Qualcomm firmware situation isn't as simple. User space needs to know
> which setup it is dealing with because all the setup happens via QMI.
>
> Let's take the BAM-DMUX channels vs QMAP mux-IDs for example:
>
> First, user space needs to configure the data format. This happens with
> the QMI WDA (Wireless Data Administrative Service) "Set Data Format"
> message. Parameter would be link layer format (Raw-IP in both cases)
> but also the uplink/downlink data aggregation protocol. This is either
> one of many QMAP versions (qmap|qmapv2|qmapv3|qmapv4|qmapv5), or simply
> "none" when using BAM-DMUX without QMAP.
>
> Then, the "session ID" (= BAM-DMUX channel or QMAP mux-ID) must be bound
> to a WDS (Wireless Data Service) session. The QMI message for that is
> different for BAM-DMUX and QMAP:
>
> - BAM-DMUX: WDS "Bind Data Port"
> (Parameter: SIO port number, can be derived from channel ID)
>
> - QMAP: WDS "Bind MUX Data Port" (note the "MUX", different message!)
> (Parameter: MUX ID, port type (USB/embedded/...), port number)
>
> My point here: Since userspace is responsible for QMI at the moment
> we will definitely need to make it aware of the setup that it needs to
> apply. Just having an abstract "session ID" won't be enough to set up
> the connection properly. :/

Stephan, Loic, I have a polemic question related to a drivers model
that we should build to smoothly support qualcomm hardware by the
kernel. I would depict the situation as I see it and then ask the
question. Please correct me if I am misunderstanding something or
simply wrong. Or maybe you will be gracious once more and point me to
earlier discussions :)

We always talk that a userspace software should take care of
multiplexing configuration to make data communication possible at all.
The motivation here is simple - management protocol (QMI) is complex,
userspace software must implement it anyway to manage network
connectivity, so why not implement the multiplexing management there
too?

This way the userspace software that should simply command a "modem"
to establish a data connection and poll a "modem" for a signal level
became a self contained device manager that knows all modem-to-host
interconnection details and even must to perform an initial
modem-to-host interfaces negotiation and configuration. The last task
is what userspace software usually expects to be performed by an OS
kernel.

But what if we implement the QMI multiplexing management part in the
kernel? This way the kernel will take care about modem-to-host
communication protocols and interfaces, and provides userspace with a
single WWAN device (possibly with multiple network and network
management interfaces).

I do not propose to fully implement QMI protocol inside the kernel,
but implement only a mux management part, while passing all other
messages between a "modem" and a userspace software as-is.

What pros and cons of such a design do you see?

--
Sergey
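
The two-step QMI setup quoted above (WDA "Set Data Format", then a WDS bind
message that differs between BAM-DMUX and QMAP) can be sketched as a small
planning function. This is a sketch under assumptions, not ModemManager or
libqmi code: QmiStep, plan_qmi_setup and all parameter keys are hypothetical
names, and the SIO port derivation is left out because the thread does not
give the mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QmiStep:
    """Hypothetical description of one QMI request user space must send."""
    service: str
    message: str
    params: dict


def plan_qmi_setup(mode, session_id, port_type=None, port_number=None,
                   qmap_version="qmap"):
    """Return the QMI steps for one session, in the order they are sent.

    mode is "bam-dmux" (session_id is a BAM-DMUX channel) or "qmap"
    (session_id is a QMAP mux-id).
    """
    if mode == "bam-dmux":
        aggregation = "none"
        # The thread says the SIO port number can be derived from the
        # channel ID; the derivation is not given, so pass the channel on.
        bind = QmiStep("WDS", "Bind Data Port", {"channel_id": session_id})
    elif mode == "qmap":
        aggregation = qmap_version
        bind = QmiStep("WDS", "Bind MUX Data Port", {
            "mux_id": session_id,
            "port_type": port_type,
            "port_number": port_number,
        })
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    # The link layer format is Raw-IP in both cases; only the aggregation
    # protocol differs.
    data_format = QmiStep("WDA", "Set Data Format", {
        "link_layer": "raw-ip",
        "ul_aggregation": aggregation,
        "dl_aggregation": aggregation,
    })
    return [data_format, bind]
```

The point of the sketch is that the same abstract "session ID" leads to
different messages and parameters, which is why user space has to know
which setup it is dealing with.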

2021-07-26 08:13:24

by Aleksander Morgado

Subject: Re: [RFC PATCH net-next 4/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

Hey!

>
> But what if we implement the QMI multiplexing management part in the
> kernel? This way the kernel will take care about modem-to-host
> communication protocols and interfaces, and provides userspace with a
> single WWAN device (possibly with multiple network and network
> management interfaces).
>
> I do not propose to fully implement QMI protocol inside the kernel,
> but implement only a mux management part, while passing all other
> messages between a "modem" and a userspace software as-is.
>
> What pros and cons of such a design do you see?
>

The original GobiNet driver already provided some QMI protocol
implementation in the driver itself. In addition to initial device
setup as you suggest, it also allowed userspace applications to
allocate and release QMI clients for the different services that could
be used independently by different processes. Not going to say that
was the wrong way to do it, but the implementation is definitely not
simple. The decision taken in qmi_wwan to make the driver as simple as
possible and leave all the QMI management to userspace was quite an
important one; it made the driver extremely simple, leaving all the
complexity of managing the protocol to userspace, and while it had
some initial drawbacks (e.g. only one process could talk QMI at a
time) the userspace tools have evolved to avoid them (e.g. the
qmi-proxy).

I wrote some time ago about this, maybe it's still relevant today:
Blogpost https://sigquit.wordpress.com/2014/06/11/qmiwwan-or-gobinet/,
Article in PDF https://aleksander.es/data/Qualcomm%20Gobi%20devices%20on%20Linux.pdf

Making the driver talk QMI just for device setup would require the
kernel to know how the QMI protocol works, how QMI client allocations
and releases are done, how errors are reported, how is the format of
the requests and responses involved; it would require the kernel to
wait until the QMI protocol endpoint in the modem is capable of
returning QMI responses (this could be up to 20 or 30 secs after the
device is available in the bus), it would require to have possibly
some specific rules on how the QMI clients are managed after a
suspend/resume operation. It would also require to sync the access to
the CTL service, which is the one running QMI service allocations and
releases, so that both kernel and userspace can perform operations
with that service at the same time. It would need to know how
different QMI capable devices behave, because not all devices support
the same services, and some don't even support the WDA service that
would be the one needed to setup data aggregation. There is definitely
some overlap on what the kernel could do and what userspace could do,
and I'd say that we have much more flexibility in userspace to do all
this leaving all the complexity out of the kernel driver.

ModemManager already provides a unified API to e.g. setup multiplexed
data sessions, regardless of what the underlying kernel implementation
is (qmi_wwan only, qmi_wwan+rmnet, ipa+rmnet, bam-dmux, cdc_mbim...) .
The logic doing all that is extremely complex and possibly full of
errors, I would definitely not want to have all that logic in the
kernel itself, let the errors be in userspace! Unifying stuff in the
kernel is a good idea, but if you ask me, it should be done in a way
that is as simple as possible, leaving complexity to userspace, even
if that means that userspace still needs to know what type of device
we have behind the wwan subsystem, because userspace will anyway need
to know all that.

--
Aleksander
https://aleksander.es

2021-07-26 15:02:35

by Jeffrey Hugo

Subject: Re: [RFC PATCH net-next 0/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

On 7/22/2021 8:51 AM, Stephan Gerhold wrote:
> On Mon, Jul 19, 2021 at 05:13:32PM -0600, Jeffrey Hugo wrote:
>> On 7/19/2021 12:23 PM, Stephan Gerhold wrote:
>>> On Mon, Jul 19, 2021 at 09:43:27AM -0600, Jeffrey Hugo wrote:
>>>> On Mon, Jul 19, 2021 at 9:01 AM Stephan Gerhold <[email protected]> wrote:
>>>>>
>>>>> The BAM Data Multiplexer provides access to the network data channels
>>>>> of modems integrated into many older Qualcomm SoCs, e.g. Qualcomm MSM8916
>>>>> or MSM8974. This series adds a driver that allows using it.
>>>>>
>>>>> For more information about BAM-DMUX, see PATCH 4/4.
>>>>>
>>>>> Shortly said, BAM-DMUX is built using a simple protocol layer on top of
>>>>> a DMA engine (Qualcomm BAM DMA). For BAM-DMUX, the BAM DMA engine runs in
>>>>> a quite strange mode that I call "remote power collapse", where the
>>>>> modem/remote side is responsible for powering on the BAM when needed but we
>>>>> are responsible to initialize it. The BAM is power-collapsed when unneeded
>>>>> by coordinating power control via bidirectional interrupts from the
>>>>> BAM-DMUX driver.
>>>>
>>>> The hardware is physically located on the modem, and tied to the modem
>>>> regulators, etc. The modem has the ultimate "off" switch. However,
>>>> due to the BAM architecture (which is complicated), configuration uses
>>>> cooperation on both ends.
>>>>
>>>
>>> What I find strange is that it wasn't done similarly to e.g. Slimbus
>>> which has a fairly similar setup. (I used that driver as inspiration for
>>> how to use the mainline qcom_bam driver instead of the "SPS" from
>>> downstream.)
>>>
>>> Slimbus uses qcom,controlled-remotely together with the LPASS
>>> remoteproc, so it looks like there LPASS does both power-collapse
>>> and initialization of the BAM. Whereas here the modem does the
>>> power-collapse but we're supposed to do the initialization.
>>
>> I suspect I don't have a satisfactory answer for you. The teams that did
>> slimbus were not the teams involved in the bam_dmux, and the two didn't talk
>> to each other. The bam_dmux side wasn't aware of the slimbus situation, at
>> the time. I don't know if the slimbus folks knew about bam_dmux. If you
>> have two silos working independently, it's unlikely they will create exactly
>> the same solution.
>>
>
> Fair enough :)
>
>>>
>>> It's somewhat unrelated to this initial patch set since I'm not using
>>> QMAP at the moment, but I'm quite confused about the "MTU negotiation
>>> feature" that you added support for in [1]. (I *think* that is you,
>>> right?) :)
>>
>> Yes. Do I owe you for some brain damage? :)
>>
>
> A bit to be absolutely honest. :D
> But I was able to ignore this feature so far so it was not much of
> a problem. ;)
>
>>>
>>> The part that I somewhat understand is the "signal" sent in the "OPEN"
>>> command from the modem. It tells us the maximum buffer size the modem
>>> is willing to accept for TX packets ("ul_mtu" in that commit).
>>>
>>> Similarly, if we send "OPEN" to the modem we make the modem aware
>>> of our maximum RX buffer size plus the number of RX buffers.
>>> (create_open_signal() function).
>>>
>>> The part that is confusing me is the way the "dynamic MTU" is
>>> enabled/disabled based on the "signal" in "DATA" commands as well.
>>> (process_dynamic_mtu() function). When would that happen? The code
>>> suggests that the modem might just suddenly announce that the large
>>> MTU should be used from now on. But the "buffer_size" is only changed
>>> for newly queued RX buffers so I'm not even sure how the modem knows
>>> that it can now send more data at once.
>>>
>>> Any chance you could clarify how this should work exactly?
>>
>> So, I think some of this might make more sense after my response to question
>> #2.
>>
>
> Indeed, I was worried that you wouldn't be able to answer the second
> one, otherwise I would probably have asked it first. I'll reorder the
> mail because it's clearer:
>
>>> And a second question if you don't mind: What kind of hardware block
>>> am I actually talking to here? I say "modem" above but I just know about
>>> the BAM and the DMUX protocol layer. I have also seen assertion failures
>>> of the modem DSP firmware if I implement something incorrectly.
>>>
>>> Is the DMUX protocol just some firmware concept or actually something
>>> understood by some hardware block? I've also often seen mentions of some
>>> "A2" hardware block but I have no idea what that actually is. What's
>>> even worse, in a really old kernel A2/BAM-DMUX also appears as part of
>>> the IPA driver [2], and I thought IPA is the new thing after BAM-DMUX...
>>
>> A2 predates IPA. IPA is essentially an evolution of A2.
>>
>> Sit down son, let me tell you the history of the world :)
>>
>> A long time ago, there was only a single processor that did both the "modem"
>> and the "apps". We generally would call these the 6K days as that was the
>> number of the chips (6XXX). Then it was decided that the roles of Apps and
>> Modem should be separated into two different cores. The modem, handling more
>> "real time" things, and apps, being more "general purpose". This started
>> with the 7K series.
>>
>> However, this created a problem as data from a data call may need to be
>> consumed by the modem, or the apps, and it wouldn't be clear until the
>> packet headers were inspected, where the packet needed to be routed to.
>> Sometimes this was handled on apps, sometimes on modem. Usually via a fully
>> featured IP stack.
>>
>> With LTE, software couldn't really keep up, and so a hardware engine to
>> parse the fields and route the packet based on programmed filters was
>> implemented. This is the "Algorithm Accelerator", aka AA, aka A2.
>>
>> The A2 first appeared on the 9600 chip, which was originally intended for
>> Gobi- those dongles you could plug into your laptop to give it a data
>> connection on the go when there was no wifi. It was then coupled with both
>> 7x30 and 8660 in what we would call "fusion" to create the first LTE capable
>> phones (HTC thunderbolt is the product I recall) until an integrated
>> solution could come along.
>>
>> That integrated solution was 8960.
>>
>> Back to the fusion solution for a second, the 9600 was connected to the
>> 7x30/8660 via SDIO. Prior to this, the data call control and data path was
>> all in chip via SMD. Each rmnet instance had its own SMD channel, so
>> essentially its own physical pipe. With SDIO and 9600, there were not
>> enough lanes, so we invented SDIO_CMUX and SDIO_DMUX - the Control and Data
>> multiplexers over SDIO.
>>
>> With 8960, everything was integrated again, so we could run the control path
>> over SMD and didn't need a mux. However, the A2 moved from the 9600 modem
>> to the 8960 integrated modem, and now we had a direct connection to its BAM.
>> Again, the BAM had a limited number of physical pipes, so we needed a data
>> multiplexer again. Thus SDIO_DMUX evolved into BAM_DMUX.
>>
>> The A2 is a hardware block with an attached BAM, that "hangs off" the modem.
>> There is a software component that also runs on the modem, but in general is
>> limited to configuration. Processing of data is expected to be all in
>> hardware. As I think I mentioned, the A2 is a hardware engine that routes
>> IP packets based on programmed filters.
>>
>> BAM instances (as part of the smart peripheral subsystem or SPS) can either
>> be out in the system, or attached to a peripheral. The A2 BAM is attached
>> to the A2 peripheral. BAM instances can run in one of 3 modes - BAM-to-BAM,
>> BAM-to-System, or System-to-System. BAM-to-BAM is two BAM instances talking
>> to each other. If the USB controller has a BAM, and the A2 has a BAM, those
>> two BAMs could talk directly to copy data between the A2 and USB hardware
>> blocks without software interaction (after some configuration). "System"
>> means system memory, or DDR. BAM-to-System is the mode the A2 BAM runs in
>> where it takes data to/from DDR and gives/takes that data with the A2.
>> System-to-System would be used by a BAM instance not associated with any
>> peripheral to transfer data say from Apps DDR to Modem DDR.
>>
>> The A2 can get data from the RF interface, and determine if that needs to go
>> to some modem consumer, the apps processor, or on some chips to the wifi
>> processor. All in hardware, much faster than software for multiple reasons,
>> but mainly because multiple filters can be evaluated in parallel, each
>> filter looking at multiple fields in parallel. In a nutshell, the IPA is a
>> revised A2 that is not associated with any processor (like the modem), which
>> allows it to route data better (think wifi and audio usecases).
>>
>> Hope that all helps. I'm "around" for more questions.
>>
>
> Wow, I can't thank you enough for all the detailed explanations!
> I've seen many small hints of this in various places but I could never
> really understand how they all relate to each other.
> This is much clearer now. :)
>
>> I don't know how much of this translates to modern platforms. I don't
>> really work on MSMs anymore, but I can convey what I recall and how things
>> were "back then"
>>
>> So, essentially the change you are looking at is the bam_dmux portion of an
>> overall feature for improving the performance of what was known as "tethered
>> rmnet".
>>
>> Per my understanding (which the documentation of this feature reinforces),
>> tethered rmnet was chiefly a test feature. Your "data" (websites, email,
>> etc) could be consumed by the device itself, or exported off, if you
>> tethered your phone to a laptop so that the laptop could use the phone's
>> data connection. There ends up being 3 implementations for this.
>>
>> Consuming the data on the phone would route it to the IP stack via the rmnet
>> driver.
>>
>> Consuming the data on an external device could take one of 2 routes.
>>
>> Android would use the "native" routing of the Linux IP stack to essentially
>> NAT the laptop. The data would go to the rmnet driver, to the IP stack, and
>> the IP stack would route it to USB.
>>
>> The other route is that the data could be routed directly to USB. This is
>> "tethered rmnet". In the case of bam_dmux platforms, the USB stack is a
>> client of bam_dmux.
>>
>> Tethered rmnet was never an end-user usecase.
>>
>
> I'm pretty sure it's actively used now on typical USB modems based on
> MDM9607. As far as I know that one has BAM-DMUX and "forwards" it via
> USB (without NAT).
>
>> It was essentially a validation feature for both internal testing, and
>> also qualifying the device with the carriers. The carriers knew that
>> Android tethering involved NAT based routing on the phone, and wanted
>> to figure out if the phone could meet the raw performance specs of the
>> RF technology (LTE Category 4 in this case) in a tethered scenario,
>> without the routing.
>>
>> For tethered rmnet, USB (at the time) was having issues consistently meeting
>> those data rates (50 Mbps UL, 100 Mbps DL concurrently, if I recall
>> correctly). So, the decided solution was to implement QMAP aggregation.
>>
>> A QMAP "call" over tethered rmnet would be negotiated between the app on the
>> PC, and "dataservices" or "DS" on the modem. One of the initial steps of
>> that negotiation causes DS to tell A2 software that QMAP over tethered
>> rmnet is being activated. That would trigger A2 to activate the
>> process_dynamic_mtu() code path. Now bam_dmux would allocate future RX
>> buffers of the increased size which could handle the aggregated packets. I
>> think the part that is confusing you is, what about the already queued
>> buffers that are of the old size? Well, essentially those get consumed by
>> the rest of the QMAP call negotiation, so by the time actual aggregated data
>> is going to be sent from Modem to bam_dmux, the pool has been consumed and
>> refilled.
>>
>> When the tethered rmnet connection is "brought down", DS notifies A2, and A2
>> stops requesting the larger buffers.
>>
>
> Hmm, is this "DS" on the modem something special I don't know about?
> It sounds like the part of the modem that I talk to via QMI to establish
> new connections.

You have the gist of it. Kinda need to dance around here :(

> However, since QMI does not go through BAM-DMUX
> (RPMSG/SMD or QRTR instead) there should be only very few packets sent
> via BAM-DMUX during negotiation of QMAP.
>
> To be sure I just tried QMAP with my BAM-DMUX driver again. It's been
> quite some time since I tried it and it turns out this causes even more
> "brain damage" than I could even remember. :D For reference:
>
> 1. First I need to set the modem to QMAP mode, this works e.g. with
> qmicli -pd /dev/wwan0qmi0 \
> --wda-set-data-format="link-layer-protocol=raw-ip,ul-protocol=qmap,dl-protocol=qmap,dl-datagram-max-size=4096"
>
> However, it's important that my BAM-DMUX driver OPENs the channel
> before doing this (together with announcing support for the "dynamic
> MTU" feature). Otherwise the modem hangs forever and stops responding
> to any QMI messages. This doesn't happen when switching to Raw-IP mode.
>
> 2. With QMAP, the struct bam_dmux_hdr->len is always set to 0xffff (65535)
> instead of the actual packet length, which means my current driver
> just drops those packets ("Data larger than buffer? (65535 > 4088)").
>
> This is also handled in your commit (you get the size from the SPS
> driver instead), but the bam_dma driver in mainline currently does
> not have this feature. :/

Huh. It seems really odd to me that the client doesn't get "notified"
of the actual length of data transferred. That could easily be less
than the buffer provided, so there isn't a way for the client to derive
that info.
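
For illustration, the length check at issue in point 2 can be sketched as
follows. The 4096-byte buffer and 8-byte header are assumptions inferred
from the quoted log message (4096 - 8 = 4088), and dma_transferred is a
hypothetical stand-in for a byte count reported by the DMA layer, which the
thread says mainline bam_dma did not provide at the time.

```python
BUFFER_SIZE = 4096     # assumed RX buffer size
HEADER_SIZE = 8        # assumed header size (4096 - 4088 in the log message)
LEN_UNKNOWN = 0xFFFF   # value seen in the header length field with QMAP


def rx_payload_len(hdr_len, dma_transferred=None):
    """Return the usable payload length, or None if the packet is dropped.

    hdr_len is the length field from the packet header. dma_transferred is
    the number of bytes the DMA engine actually wrote (hypothetical), used
    only when the header does not carry a real length.
    """
    max_payload = BUFFER_SIZE - HEADER_SIZE
    if hdr_len == LEN_UNKNOWN and dma_transferred is not None:
        # Fall back to what the DMA engine reports, minus the header.
        hdr_len = dma_transferred - HEADER_SIZE
    if hdr_len < 0 or hdr_len > max_payload:
        # Corresponds to "Data larger than buffer?" in the quoted driver log.
        return None
    return hdr_len
```

Without a transferred-byte count the 0xffff case cannot be recovered, so the
sketch drops the packet, which is the behaviour described for the current
driver.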

>
> 3. I sent some ping packets but never got the signal to "enable large
> MTU" from the modem. Something is still strange here. :/
>
> Given all these complications (that are not present when ignoring QMAP)
> I would generally agree with you that it's not worth supporting this:
>
>> Since this is not something an end user should ever exercise, you may want to
>> consider dropping it.
>>
>
> Personally, I have indeed no need for it. I just suspect someone might
> want this eventually for one of the following two use cases:
>
> 1. Multiplexing on new firmwares: AFAICT there is only one BAM-DMUX
> channel on recent firmware versions (e.g. MSM8937/MDM9607). In that
> case multiple connections are only possible through the multiplexing
> layer in QMAP. I've been told the multiplexing is actually useful and
> necessary in some cases (maybe it was for some MMS configurations,
> I don't remember exactly).
>
> 2. USB tethering: I know some people are working on mainline Linux
> for some MDM9607-based USB modems and they will probably want the
> weird USB tethering feature at some point.
>
> But all in all given all the trouble involved when making QMAP work
> I think I will just ignore that feature for now and wait until someone
> shows up who absolutely needs this feature...

QMAP is useful. It finally gets rid of the need to have multiple
physical pipes that exist because of "reasons". Sadly, it came about a
decade late IMO. Regardless, I think I agree. Focus on what you really
care about, and leave everything else until later. Thankfully layering
makes that easier. Otherwise, nothing gets done.
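
The buffer turnover described above for process_dynamic_mtu() (only newly
queued RX buffers get the larger size, and the old ones drain during the
rest of the call negotiation) can be modelled with a simple queue. This is a
toy model, not driver code: RxPool, the pool depth of 4 and the buffer sizes
are all made up for illustration.

```python
from collections import deque


class RxPool:
    """Toy model of a fixed-depth pool of queued RX buffers."""

    def __init__(self, num_buffers, buffer_size):
        self.buffer_size = buffer_size
        # Each entry is the size of one buffer already handed to the DMA
        # engine, oldest first.
        self.queued = deque([buffer_size] * num_buffers)

    def set_buffer_size(self, size):
        # Like the dynamic MTU signal: affects only buffers queued from
        # now on, never the ones that are already queued.
        self.buffer_size = size

    def receive_one(self):
        """Consume the oldest buffer and queue a replacement.

        Returns the size of the buffer the incoming packet landed in.
        """
        used = self.queued.popleft()
        self.queued.append(self.buffer_size)
        return used


pool = RxPool(num_buffers=4, buffer_size=2048)
pool.set_buffer_size(16384)
# The next four packets still land in old, small buffers...
old_sizes = [pool.receive_one() for _ in range(4)]
# ...after which every queued buffer has the new size.
new_sizes = list(pool.queued)
```

In this model the modem may only start sending aggregated data after as many
packets as the pool is deep have been received, which is the role the thread
assigns to the remaining negotiation traffic.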

2021-07-26 22:41:27

by Sergey Ryazanov

Subject: Re: [RFC PATCH net-next 4/4] net: wwan: Add Qualcomm BAM-DMUX WWAN network driver

Hello Aleksander,

On Mon, Jul 26, 2021 at 11:11 AM Aleksander Morgado
<[email protected]> wrote:
>> But what if we implement the QMI multiplexing management part in the
>> kernel? This way the kernel will take care about modem-to-host
>> communication protocols and interfaces, and provides userspace with a
>> single WWAN device (possibly with multiple network and network
>> management interfaces).
>>
>> I do not propose to fully implement QMI protocol inside the kernel,
>> but implement only a mux management part, while passing all other
>> messages between a "modem" and a userspace software as-is.
>>
>> What pros and cons of such a design do you see?
>
> The original GobiNet driver already provided some QMI protocol
> implementation in the driver itself. In addition to initial device
> setup as you suggest, it also allowed userspace applications to
> allocate and release QMI clients for the different services that could
> be used independently by different processes. Not going to say that
> was the wrong way to do it, but the implementation is definitely not
> simple. The decision taken in qmi_wwan to make the driver as simple as
> possible and leave all the QMI management to userspace was quite an
> important one; it made the driver extremely simple, leaving all the
> complexity of managing the protocol to userspace, and while it had
> some initial drawbacks (e.g. only one process could talk QMI at a
> time) the userspace tools have evolved to avoid them (e.g. the
> qmi-proxy).
>
> I wrote some time ago about this, maybe it's still relevant today:
> Blogpost https://sigquit.wordpress.com/2014/06/11/qmiwwan-or-gobinet/,
> Article in PDF https://aleksander.es/data/Qualcomm%20Gobi%20devices%20on%20Linux.pdf
>
> Making the driver talk QMI just for device setup would require the
> kernel to know how the QMI protocol works, how QMI client allocations
> and releases are done, how errors are reported, how is the format of
> the requests and responses involved; it would require the kernel to
> wait until the QMI protocol endpoint in the modem is capable of
> returning QMI responses (this could be up to 20 or 30 secs after the
> device is available in the bus), it would require to have possibly
> some specific rules on how the QMI clients are managed after a
> suspend/resume operation. It would also require to sync the access to
> the CTL service, which is the one running QMI service allocations and
> releases, so that both kernel and userspace can perform operations
> with that service at the same time. It would need to know how
> different QMI capable devices behave, because not all devices support
> the same services, and some don't even support the WDA service that
> would be the one needed to setup data aggregation. There is definitely
> some overlap on what the kernel could do and what userspace could do,
> and I'd say that we have much more flexibility in userspace to do all
> this leaving all the complexity out of the kernel driver.
>
> ModemManager already provides a unified API to e.g. setup multiplexed
> data sessions, regardless of what the underlying kernel implementation
> is (qmi_wwan only, qmi_wwan+rmnet, ipa+rmnet, bam-dmux, cdc_mbim...).
> The logic doing all that is extremely complex and possibly full of
> errors. I would definitely not want to have all that logic in the
> kernel itself; let the errors be in userspace! Unifying stuff in the
> kernel is a good idea, but if you ask me, it should be done in a way
> that is as simple as possible, leaving complexity to userspace, even
> if that means that userspace still needs to know what type of device
> we have behind the wwan subsystem, because userspace will anyway need
> to know all that.

Ouch! All these QMI internals are like a can of worms. Each time I
start thinking that I have learned something, I face another complexity.
Many thanks for your detailed reply and for your blogpost. Seeing a
side-by-side comparison of the approaches was quite helpful for my
understanding!

The argument for keeping drivers minimalistic to keep the system
stable sounds reasonable. But I still feel uncomfortable when
userspace software manages a device at such a low level. Maybe it is a
matter of taste, or maybe I still do not realize the whole complexity.
Anyway, in light of your clarification, I should be more careful
in the future with calls to implement QMI in the kernel :)

--
Sergey

2021-07-29 19:39:03

by Rob Herring (Arm)

Subject: Re: [RFC PATCH net-next 1/4] dt-bindings: dmaengine: bam_dma: Add remote power collapse mode

On Mon, Jul 19, 2021 at 04:53:14PM +0200, Stephan Gerhold wrote:
> In some configurations, the BAM DMA controller is set up by a remote
> processor and the local processor can simply start making use of it
> without setting up the BAM. This is already supported using the
> "qcom,controlled-remotely" property.
>
> However, for some reason another possible configuration is that the
> remote processor is responsible for powering up the BAM, but we are
> still responsible for initializing it (e.g. resetting it etc). Add
> a "qcom,remote-power-collapse" property to describe that configuration.
>
> Signed-off-by: Stephan Gerhold <[email protected]>
> ---
> NOTE: This is *not* a compile-time requirement for the BAM-DMUX driver
> so this could also go through the dmaengine tree.
>
> Also note that there is an ongoing effort to convert these bindings
> to DT schema but sadly there were not any updates for a while. :/
> https://lore.kernel.org/linux-arm-msm/[email protected]/
> ---
> Documentation/devicetree/bindings/dma/qcom_bam_dma.txt | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt b/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> index cf5b9e44432c..362a4f0905a8 100644
> --- a/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> +++ b/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> @@ -15,6 +15,8 @@ Required properties:
> the secure world.
> - qcom,controlled-remotely : optional, indicates that the bam is controlled by
> remote proccessor i.e. execution environment.
> +- qcom,remote-power-collapse : optional, indicates that the bam is powered up by
> + a remote processor but must be initialized by the local processor.

Wouldn't 'qcom,remote-power' or 'qcom,remote-powered' be sufficient? I
don't understand what 'collapse' means here. Doesn't sound good though.

Rob

2021-07-29 19:52:12

by Stephan Gerhold

Subject: Re: [RFC PATCH net-next 1/4] dt-bindings: dmaengine: bam_dma: Add remote power collapse mode

On Thu, Jul 29, 2021 at 01:36:31PM -0600, Rob Herring wrote:
> On Mon, Jul 19, 2021 at 04:53:14PM +0200, Stephan Gerhold wrote:
> > In some configurations, the BAM DMA controller is set up by a remote
> > processor and the local processor can simply start making use of it
> > without setting up the BAM. This is already supported using the
> > "qcom,controlled-remotely" property.
> >
> > However, for some reason another possible configuration is that the
> > remote processor is responsible for powering up the BAM, but we are
> > still responsible for initializing it (e.g. resetting it etc). Add
> > a "qcom,remote-power-collapse" property to describe that configuration.
> >
> > Signed-off-by: Stephan Gerhold <[email protected]>
> > ---
> > NOTE: This is *not* a compile-time requirement for the BAM-DMUX driver
> > so this could also go through the dmaengine tree.
> >
> > Also note that there is an ongoing effort to convert these bindings
> > to DT schema but sadly there were not any updates for a while. :/
> > https://lore.kernel.org/linux-arm-msm/[email protected]/
> > ---
> > Documentation/devicetree/bindings/dma/qcom_bam_dma.txt | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt b/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> > index cf5b9e44432c..362a4f0905a8 100644
> > --- a/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> > +++ b/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> > @@ -15,6 +15,8 @@ Required properties:
> > the secure world.
> > - qcom,controlled-remotely : optional, indicates that the bam is controlled by
> > remote proccessor i.e. execution environment.
> > +- qcom,remote-power-collapse : optional, indicates that the bam is powered up by
> > + a remote processor but must be initialized by the local processor.
>
> Wouldn't 'qcom,remote-power' or 'qcom,remote-powered' be sufficient? I
> don't understand what 'collapse' means here. Doesn't sound good though.
>

Yeah, I can't think of any significant meaning of the "collapse" part
for the bindings. I probably just picked it up somewhere while trying to
find some information about how the BAM DMUX setup works. :)

Just one question: would you prefer "qcom,remote-powered" or rather
"qcom,powered-remotely" for consistency with the existing
"qcom,controlled-remotely"? Both sound fine to me.

Thanks!
Stephan
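
For illustration only, a minimal sketch of how a devicetree node might use
the property discussed above. The node label, unit address, register
range, interrupt and "qcom,ee" value are placeholders that do not come from
this thread, and the list of required properties may be incomplete. The
property name is the one proposed in the patch; the replies above suggest
it may be renamed (e.g. to "qcom,remote-powered" or
"qcom,powered-remotely").

```
/* Hypothetical example node: address, interrupt and EE values are
 * placeholders chosen for illustration, not taken from the patch. */
bam_dmux_dma: dma-controller@4044000 {
	compatible = "qcom,bam-v1.7.0";
	reg = <0x04044000 0x19000>;
	interrupts = <GIC_SPI 29 IRQ_TYPE_LEVEL_HIGH>;
	#dma-cells = <1>;
	qcom,ee = <0>;

	/* Proposed in PATCH 1/4: the remote processor powers up the BAM,
	 * but the local processor must still initialize (reset) it.
	 * The final property name is still under discussion. */
	qcom,remote-power-collapse;
};
```

Unlike "qcom,controlled-remotely", which tells the local processor to skip
BAM setup entirely, the new property is meant to describe only who powers
the block, so the two would presumably not be set on the same node.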