2024-03-19 08:14:19

by Shengjiu Wang

Subject: [PATCH v15 00/16] Add audio support in v4l2 framework

Audio signal processing, like video, also has memory-to-memory
requirements.

This ASRC memory-to-memory (memory -> ASRC -> memory) case is a
non-real-time use case.

The user fills the input buffer for the ASRC module; after conversion, the
ASRC sends the output buffer back to the user. So it is not a traditional
ALSA playback and capture case.

It is a specific use case with no precedent in the current kernel.
The V4L2 memory-to-memory framework is the closest existing implementation;
V4L2 currently supports video, image, radio, tuner and touch devices, so it
is not complicated to add support for this specific audio case.

We have already implemented the "memory -> asrc -> i2s device -> codec"
use case in ALSA. The "memory->asrc->memory" case now needs
to reuse the code in the ASRC driver, so the first 3 patches refine
that code so it can be shared by the "memory->asrc->memory"
driver.

The main change is on the V4L2 side: a /dev/v4l-audioX device will be
created, and user applications only use the ioctls of the V4L2 framework.

The other change is to add memory-to-memory support for two kinds of i.MX
ASRC modules.

changes in v15:
- update MAINTAINERS for imx-asrc.c and vim2m-audio.c

changes in v14:
- document the reservation of 'AUXX' fourcc format.
- add v4l2_audfmt_to_fourcc() definition.

changes in v13:
- change 'pixelformat' to 'audioformat' in dev-audio-mem2mem.rst
- add more description for clock drift in ext-ctrls-audio-m2m.rst
- Add "media: v4l2-ctrls: add support for fraction_bits" from Hans
to avoid build issue for kernel test robot

changes in v12:
- minor changes according to comments
- drop min_buffers_needed = 1 and V4L2_CTRL_FLAG_UPDATE flag
- drop bus_info

changes in v11:
- add fixed-point test controls in vivid.
- add v4l2_ctrl_fp_compose() helper function for min and max
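As an illustration of the helper above: it packs a signed integer part and a fractional part into one fixed-point control value (Q-format), as used for the Q31.32 rate-offset control bounds later in this series. A minimal Python sketch of the idea; the function name and semantics here are inferred from how the control min/max are composed, not copied from the kernel macro:

```python
def fp_compose(integer, frac, fraction_bits):
    """Pack an integer part and a fractional part into a signed
    fixed-point value with `fraction_bits` bits of fraction."""
    return (integer << fraction_bits) | frac

# Bounds as used for the Q31.32 rate-offset controls:
ctrl_min = fp_compose(-128, 0, 32)
ctrl_max = fp_compose(127, 0xffffffff, 32)
```

Dividing a composed value by 2**fraction_bits recovers the real number it represents, e.g. `fp_compose(1, 0x80000000, 32)` encodes 1.5.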

changes in v10:
- remove FIXED_POINT type
- change code base on media: v4l2-ctrls: add support for fraction_bits
- fix issue reported by kernel test robot
- remove module_alias

changes in v9:
- add MEDIA_ENT_F_PROC_AUDIO_RESAMPLER.
- add MEDIA_INTF_T_V4L_AUDIO
- add media controller support
- refine the vim2m-audio to support 8k<->16k conversion.

changes in v8:
- refine V4L2_CAP_AUDIO_M2M to be 0x00000008
- update doc for FIXED_POINT
- address comments for imx-asrc

changes in v7:
- add acked-by from Mark
- separate commit for fixed point, m2m audio class, audio rate controls
- use INTEGER_MENU for rate, FIXED_POINT for rate offset
- remove used fmts
- address other comments for Hans

changes in v6:
- use m2m_prepare/m2m_unprepare/m2m_start/m2m_stop to replace
m2m_start_part_one/m2m_stop_part_one, m2m_start_part_two/m2m_stop_part_two.
- change V4L2_CTRL_TYPE_ASRC_RATE to V4L2_CTRL_TYPE_FIXED_POINT
- fix warning reported by kernel test robot
- remove some unused format V4L2_AUDIO_FMT_XX
- Get SNDRV_PCM_FORMAT from V4L2_AUDIO_FMT in driver.
- rename audm2m to viaudm2m.

changes in v5:
- remove V4L2_AUDIO_FMT_LPCM
- define audio pixel format like V4L2_AUDIO_FMT_S8...
- remove rate and format in struct v4l2_audio_format.
- Add V4L2_CID_ASRC_SOURCE_RATE and V4L2_CID_ASRC_DEST_RATE controls
- update documents accordingly.

changes in v4:
- update document style
- separate V4L2_AUDIO_FMT_LPCM and V4L2_CAP_AUDIO_M2M in separate commit

changes in v3:
- Modify documents for adding audio m2m support
- Add audio virtual m2m driver
- Defined V4L2_AUDIO_FMT_LPCM format type for audio.
- Defined V4L2_CAP_AUDIO_M2M capability type for audio m2m case.
- with modifications in v4l-utils, passes the v4l2-compliance test.

changes in v2:
- decouple the implementation in v4l2 and ALSA
- implement the memory to memory driver as a platform driver
and move it to drivers/media
- move fsl_asrc_common.h to include/sound folder

Hans Verkuil (1):
media: v4l2-ctrls: add support for fraction_bits

Shengjiu Wang (15):
ASoC: fsl_asrc: define functions for memory to memory usage
ASoC: fsl_easrc: define functions for memory to memory usage
ASoC: fsl_asrc: move fsl_asrc_common.h to include/sound
ASoC: fsl_asrc: register m2m platform device
ASoC: fsl_easrc: register m2m platform device
media: uapi: Add V4L2_CAP_AUDIO_M2M capability flag
media: v4l2: Add audio capture and output support
media: uapi: Define audio sample format fourcc type
media: uapi: Add V4L2_CTRL_CLASS_M2M_AUDIO
media: uapi: Add audio rate controls support
media: uapi: Declare interface types for Audio
media: uapi: Add an entity type for audio resampler
media: vivid: add fixed point test controls
media: imx-asrc: Add memory to memory driver
media: vim2m-audio: add virtual driver for audio memory to memory

.../media/mediactl/media-types.rst | 11 +
.../userspace-api/media/v4l/buffer.rst | 6 +
.../userspace-api/media/v4l/common.rst | 1 +
.../media/v4l/dev-audio-mem2mem.rst | 71 +
.../userspace-api/media/v4l/devices.rst | 1 +
.../media/v4l/ext-ctrls-audio-m2m.rst | 59 +
.../userspace-api/media/v4l/pixfmt-audio.rst | 100 ++
.../userspace-api/media/v4l/pixfmt.rst | 1 +
.../media/v4l/vidioc-enum-fmt.rst | 2 +
.../media/v4l/vidioc-g-ext-ctrls.rst | 4 +
.../userspace-api/media/v4l/vidioc-g-fmt.rst | 4 +
.../media/v4l/vidioc-querycap.rst | 3 +
.../media/v4l/vidioc-queryctrl.rst | 11 +-
.../media/videodev2.h.rst.exceptions | 3 +
MAINTAINERS | 17 +
.../media/common/videobuf2/videobuf2-v4l2.c | 4 +
drivers/media/platform/nxp/Kconfig | 13 +
drivers/media/platform/nxp/Makefile | 1 +
drivers/media/platform/nxp/imx-asrc.c | 1256 +++++++++++++++++
drivers/media/test-drivers/Kconfig | 10 +
drivers/media/test-drivers/Makefile | 1 +
drivers/media/test-drivers/vim2m-audio.c | 793 +++++++++++
drivers/media/test-drivers/vivid/vivid-core.h | 2 +
.../media/test-drivers/vivid/vivid-ctrls.c | 26 +
drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 9 +
drivers/media/v4l2-core/v4l2-ctrls-api.c | 1 +
drivers/media/v4l2-core/v4l2-ctrls-core.c | 93 +-
drivers/media/v4l2-core/v4l2-ctrls-defs.c | 10 +
drivers/media/v4l2-core/v4l2-dev.c | 21 +
drivers/media/v4l2-core/v4l2-ioctl.c | 66 +
drivers/media/v4l2-core/v4l2-mem2mem.c | 13 +-
include/media/v4l2-ctrls.h | 13 +-
include/media/v4l2-dev.h | 2 +
include/media/v4l2-ioctl.h | 34 +
.../fsl => include/sound}/fsl_asrc_common.h | 60 +
include/uapi/linux/media.h | 2 +
include/uapi/linux/v4l2-controls.h | 9 +
include/uapi/linux/videodev2.h | 50 +-
sound/soc/fsl/fsl_asrc.c | 144 ++
sound/soc/fsl/fsl_asrc.h | 4 +-
sound/soc/fsl/fsl_asrc_dma.c | 2 +-
sound/soc/fsl/fsl_easrc.c | 233 +++
sound/soc/fsl/fsl_easrc.h | 6 +-
43 files changed, 3145 insertions(+), 27 deletions(-)
create mode 100644 Documentation/userspace-api/media/v4l/dev-audio-mem2mem.rst
create mode 100644 Documentation/userspace-api/media/v4l/ext-ctrls-audio-m2m.rst
create mode 100644 Documentation/userspace-api/media/v4l/pixfmt-audio.rst
create mode 100644 drivers/media/platform/nxp/imx-asrc.c
create mode 100644 drivers/media/test-drivers/vim2m-audio.c
rename {sound/soc/fsl => include/sound}/fsl_asrc_common.h (60%)

--
2.34.1



2024-03-19 08:17:11

by Shengjiu Wang

Subject: [PATCH v15 15/16] media: imx-asrc: Add memory to memory driver

Implement the ASRC memory-to-memory function using
the V4L2 framework; users can access this function
through the V4L2 ioctl interface.

The user sends the output and capture buffers to the driver
and the driver stores the converted data in the capture buffer.

This feature can be shared by the ASRC and EASRC drivers.

Signed-off-by: Shengjiu Wang <[email protected]>
---
MAINTAINERS | 8 +
drivers/media/platform/nxp/Kconfig | 13 +
drivers/media/platform/nxp/Makefile | 1 +
drivers/media/platform/nxp/imx-asrc.c | 1256 +++++++++++++++++++++++++
4 files changed, 1278 insertions(+)
create mode 100644 drivers/media/platform/nxp/imx-asrc.c
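As background for the DMA configuration in the driver below: each buffer is split into scatter-gather periods of at most ASRC_M2M_PERIOD_SIZE bytes, with the period size rounded down to a whole audio frame and a final entry carrying the remainder. A standalone sketch of that sizing arithmetic (illustrative only; the constant and the rounding rule are taken from the driver source):

```python
ASRC_M2M_PERIOD_SIZE = 48 * 1024  # maximum period size in bytes

def sg_layout(buf_len, sample_width_bits, channels):
    """Return the scatter-gather entry lengths that would be
    programmed for a buffer of buf_len bytes: full periods rounded
    down to a whole frame, plus one final remainder entry."""
    frame_bytes = sample_width_bits * channels // 8
    # rounddown(ASRC_M2M_PERIOD_SIZE, frame_bytes)
    max_period = ASRC_M2M_PERIOD_SIZE - (ASRC_M2M_PERIOD_SIZE % frame_bytes)
    sg_len = buf_len // max_period + (1 if buf_len % max_period else 0)
    lengths = [max_period] * (sg_len - 1)
    lengths.append(buf_len - (sg_len - 1) * max_period)
    return lengths
```

For example, a full 512 KiB buffer of 16-bit stereo samples splits into ten 48 KiB periods plus one 32 KiB tail entry.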

diff --git a/MAINTAINERS b/MAINTAINERS
index 375d34363777..7b8b9ee65c61 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15821,6 +15821,14 @@ F: drivers/nvmem/
F: include/linux/nvmem-consumer.h
F: include/linux/nvmem-provider.h

+NXP ASRC V4L2 MEM2MEM DRIVERS
+M: Shengjiu Wang <[email protected]>
+L: [email protected]
+S: Maintained
+W: https://linuxtv.org
+T: git git://linuxtv.org/media_tree.git
+F: drivers/media/platform/nxp/imx-asrc.c
+
NXP BLUETOOTH WIRELESS DRIVERS
M: Amitkumar Karwar <[email protected]>
M: Neeraj Kale <[email protected]>
diff --git a/drivers/media/platform/nxp/Kconfig b/drivers/media/platform/nxp/Kconfig
index 40e3436669e2..8d0ca335601f 100644
--- a/drivers/media/platform/nxp/Kconfig
+++ b/drivers/media/platform/nxp/Kconfig
@@ -67,3 +67,16 @@ config VIDEO_MX2_EMMAPRP

source "drivers/media/platform/nxp/dw100/Kconfig"
source "drivers/media/platform/nxp/imx-jpeg/Kconfig"
+
+config VIDEO_IMX_ASRC
+ tristate "NXP i.MX ASRC M2M support"
+ depends on V4L_MEM2MEM_DRIVERS
+ depends on MEDIA_SUPPORT
+ select VIDEOBUF2_DMA_CONTIG
+ select V4L2_MEM2MEM_DEV
+ select MEDIA_CONTROLLER
+ help
+ Say Y if you want to add ASRC M2M support for NXP CPUs.
+ It is a complement for ASRC M2P and ASRC P2M features.
+ This option is only useful for out-of-tree drivers since
+ in-tree drivers select it automatically.
diff --git a/drivers/media/platform/nxp/Makefile b/drivers/media/platform/nxp/Makefile
index 4d90eb713652..1325675e34f5 100644
--- a/drivers/media/platform/nxp/Makefile
+++ b/drivers/media/platform/nxp/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_VIDEO_IMX8MQ_MIPI_CSI2) += imx8mq-mipi-csi2.o
obj-$(CONFIG_VIDEO_IMX_MIPI_CSIS) += imx-mipi-csis.o
obj-$(CONFIG_VIDEO_IMX_PXP) += imx-pxp.o
obj-$(CONFIG_VIDEO_MX2_EMMAPRP) += mx2_emmaprp.o
+obj-$(CONFIG_VIDEO_IMX_ASRC) += imx-asrc.o
diff --git a/drivers/media/platform/nxp/imx-asrc.c b/drivers/media/platform/nxp/imx-asrc.c
new file mode 100644
index 000000000000..0c25a36199b1
--- /dev/null
+++ b/drivers/media/platform/nxp/imx-asrc.c
@@ -0,0 +1,1256 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// Copyright (C) 2014-2016 Freescale Semiconductor, Inc.
+// Copyright (C) 2019-2023 NXP
+//
+// Freescale ASRC Memory to Memory (M2M) driver
+
+#include <linux/dma/imx-dma.h>
+#include <linux/pm_runtime.h>
+#include <media/v4l2-ctrls.h>
+#include <media/v4l2-device.h>
+#include <media/v4l2-event.h>
+#include <media/v4l2-fh.h>
+#include <media/v4l2-ioctl.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/videobuf2-dma-contig.h>
+#include <sound/dmaengine_pcm.h>
+#include <sound/fsl_asrc_common.h>
+
+#define V4L_CAP OUT
+#define V4L_OUT IN
+
+#define ASRC_xPUT_DMA_CALLBACK(dir) \
+ (((dir) == V4L_OUT) ? asrc_input_dma_callback \
+ : asrc_output_dma_callback)
+
+#define DIR_STR(dir) ((dir) == V4L_OUT ? "out" : "cap")
+
+/* Maximum output and capture buffer size */
+#define ASRC_M2M_BUFFER_SIZE (512 * 1024)
+
+/* Maximum output and capture period size */
+#define ASRC_M2M_PERIOD_SIZE (48 * 1024)
+
+struct asrc_pair_m2m {
+ struct fsl_asrc_pair *pair;
+ struct asrc_m2m *m2m;
+ struct v4l2_fh fh;
+ struct v4l2_ctrl_handler ctrl_handler;
+ int channels[2];
+ unsigned int sequence[2];
+ s64 src_rate_off_prev; /* Q31.32 */
+ s64 dst_rate_off_prev; /* Q31.32 */
+ s64 src_rate_off_cur; /* Q31.32 */
+ s64 dst_rate_off_cur; /* Q31.32 */
+};
+
+struct asrc_m2m {
+ struct fsl_asrc_m2m_pdata pdata;
+ struct v4l2_device v4l2_dev;
+ struct v4l2_m2m_dev *m2m_dev;
+ struct video_device *dec_vdev;
+ struct mutex mlock; /* v4l2 ioctls serialization */
+ struct platform_device *pdev;
+#ifdef CONFIG_MEDIA_CONTROLLER
+ struct media_device mdev;
+#endif
+};
+
+static u32 formats[] = {
+ V4L2_AUDIO_FMT_S8,
+ V4L2_AUDIO_FMT_S16_LE,
+ V4L2_AUDIO_FMT_U16_LE,
+ V4L2_AUDIO_FMT_S24_LE,
+ V4L2_AUDIO_FMT_S24_3LE,
+ V4L2_AUDIO_FMT_U24_LE,
+ V4L2_AUDIO_FMT_U24_3LE,
+ V4L2_AUDIO_FMT_S32_LE,
+ V4L2_AUDIO_FMT_U32_LE,
+ V4L2_AUDIO_FMT_S20_3LE,
+ V4L2_AUDIO_FMT_U20_3LE,
+ V4L2_AUDIO_FMT_FLOAT_LE,
+ V4L2_AUDIO_FMT_IEC958_SUBFRAME_LE,
+};
+
+#define NUM_FORMATS ARRAY_SIZE(formats)
+
+static const s64 asrc_v1_m2m_rates[] = {
+ 5512, 8000, 11025, 12000, 16000,
+ 22050, 24000, 32000, 44100,
+ 48000, 64000, 88200, 96000,
+ 128000, 176400, 192000,
+};
+
+static const s64 asrc_v2_m2m_rates[] = {
+ 8000, 11025, 12000, 16000,
+ 22050, 24000, 32000, 44100,
+ 48000, 64000, 88200, 96000,
+ 128000, 176400, 192000, 256000,
+ 352800, 384000, 705600, 768000,
+};
+
+static u32 find_fourcc(snd_pcm_format_t format)
+{
+ snd_pcm_format_t fmt;
+ unsigned int k;
+
+ for (k = 0; k < NUM_FORMATS; k++) {
+ fmt = v4l2_fourcc_to_audfmt(formats[k]);
+ if (fmt == format)
+ return formats[k];
+ }
+
+ return 0;
+}
+
+static snd_pcm_format_t find_format(u32 fourcc)
+{
+ unsigned int k;
+
+ for (k = 0; k < NUM_FORMATS; k++) {
+ if (formats[k] == fourcc)
+ return v4l2_fourcc_to_audfmt(formats[k]);
+ }
+
+ return 0;
+}
+
+static int asrc_check_format(struct asrc_pair_m2m *pair_m2m, u8 dir, u32 format)
+{
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+ struct fsl_asrc_m2m_pdata *pdata = &m2m->pdata;
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+ snd_pcm_format_t fmt;
+ u64 format_bit = 0;
+ int i;
+
+ for (i = 0; i < NUM_FORMATS; ++i) {
+ if (formats[i] == format) {
+ fmt = v4l2_fourcc_to_audfmt(formats[i]);
+ format_bit = pcm_format_to_bits(fmt);
+ break;
+ }
+ }
+
+ if (dir == IN && !(format_bit & pdata->fmt_in))
+ return find_fourcc(pair->sample_format[V4L_OUT]);
+ if (dir == OUT && !(format_bit & pdata->fmt_out))
+ return find_fourcc(pair->sample_format[V4L_CAP]);
+
+ return format;
+}
+
+static int asrc_check_channel(struct asrc_pair_m2m *pair_m2m, u8 dir, u32 channels)
+{
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+ struct fsl_asrc_m2m_pdata *pdata = &m2m->pdata;
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+
+ if (channels < pdata->chan_min || channels > pdata->chan_max)
+ return pair->channels;
+
+ return channels;
+}
+
+static inline struct asrc_pair_m2m *asrc_m2m_fh_to_ctx(struct v4l2_fh *fh)
+{
+ return container_of(fh, struct asrc_pair_m2m, fh);
+}
+
+/**
+ * asrc_read_last_fifo: read all the remaining data from FIFO
+ * @pair: Structure pointer of fsl_asrc_pair
+ * @dma_vaddr: virtual address of capture buffer
+ * @length: payload length of capture buffer
+ */
+static void asrc_read_last_fifo(struct fsl_asrc_pair *pair, void *dma_vaddr, u32 *length)
+{
+ struct fsl_asrc *asrc = pair->asrc;
+ enum asrc_pair_index index = pair->index;
+ u32 i, reg, size, t_size = 0, width;
+ u32 *reg32 = NULL;
+ u16 *reg16 = NULL;
+ u8 *reg24 = NULL;
+
+ width = snd_pcm_format_physical_width(pair->sample_format[V4L_CAP]);
+ if (width == 32)
+ reg32 = dma_vaddr + *length;
+ else if (width == 16)
+ reg16 = dma_vaddr + *length;
+ else
+ reg24 = dma_vaddr + *length;
+retry:
+ size = asrc->get_output_fifo_size(pair);
+ if (size + *length > ASRC_M2M_BUFFER_SIZE)
+ goto end;
+
+ for (i = 0; i < size * pair->channels; i++) {
+ regmap_read(asrc->regmap, asrc->get_fifo_addr(OUT, index), &reg);
+ if (reg32) {
+ *reg32++ = reg;
+ } else if (reg16) {
+ *reg16++ = (u16)reg;
+ } else {
+ *reg24++ = (u8)reg;
+ *reg24++ = (u8)(reg >> 8);
+ *reg24++ = (u8)(reg >> 16);
+ }
+ }
+ t_size += size;
+
+ /* In case there is data left in FIFO */
+ if (size)
+ goto retry;
+end:
+ /* Update payload length */
+ if (reg32)
+ *length += t_size * pair->channels * 4;
+ else if (reg16)
+ *length += t_size * pair->channels * 2;
+ else
+ *length += t_size * pair->channels * 3;
+}
+
+static int asrc_m2m_start_streaming(struct vb2_queue *q, unsigned int count)
+{
+ struct asrc_pair_m2m *pair_m2m = vb2_get_drv_priv(q);
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+ struct fsl_asrc *asrc = pair->asrc;
+ struct device *dev = &m2m->pdev->dev;
+ struct vb2_v4l2_buffer *buf;
+ bool request_flag = false;
+ int ret;
+
+ dev_dbg(dev, "Start streaming pair=%p, %d\n", pair, q->type);
+
+ ret = pm_runtime_get_sync(dev);
+ if (ret < 0) {
+ dev_err(dev, "Failed to power up asrc\n");
+ goto err_pm_runtime;
+ }
+
+ /* Request asrc pair/context */
+ if (!pair->req_pair) {
+ /* flag for error handler of this function */
+ request_flag = true;
+
+ ret = asrc->request_pair(pair->channels, pair);
+ if (ret) {
+ dev_err(dev, "failed to request pair: %d\n", ret);
+ goto err_request_pair;
+ }
+
+ ret = asrc->m2m_prepare(pair);
+ if (ret) {
+ dev_err(dev, "failed to start pair part one: %d\n", ret);
+ goto err_start_part_one;
+ }
+
+ pair->req_pair = true;
+ }
+
+ /* Request dma channels */
+ if (V4L2_TYPE_IS_OUTPUT(q->type)) {
+ pair_m2m->sequence[V4L_OUT] = 0;
+ pair->dma_chan[V4L_OUT] = asrc->get_dma_channel(pair, IN);
+ if (!pair->dma_chan[V4L_OUT]) {
+ dev_err(dev, "[ctx%d] failed to get input DMA channel\n", pair->index);
+ ret = -EBUSY;
+ goto err_dma_channel;
+ }
+ } else {
+ pair_m2m->sequence[V4L_CAP] = 0;
+ pair->dma_chan[V4L_CAP] = asrc->get_dma_channel(pair, OUT);
+ if (!pair->dma_chan[V4L_CAP]) {
+ dev_err(dev, "[ctx%d] failed to get output DMA channel\n", pair->index);
+ ret = -EBUSY;
+ goto err_dma_channel;
+ }
+ }
+
+ v4l2_m2m_update_start_streaming_state(pair_m2m->fh.m2m_ctx, q);
+
+ return 0;
+
+err_dma_channel:
+ if (request_flag && asrc->m2m_unprepare)
+ asrc->m2m_unprepare(pair);
+err_start_part_one:
+ if (request_flag)
+ asrc->release_pair(pair);
+err_request_pair:
+ pm_runtime_put_sync(dev);
+err_pm_runtime:
+ /* Release buffers */
+ if (V4L2_TYPE_IS_OUTPUT(q->type)) {
+ while ((buf = v4l2_m2m_src_buf_remove(pair_m2m->fh.m2m_ctx)))
+ v4l2_m2m_buf_done(buf, VB2_BUF_STATE_QUEUED);
+ } else {
+ while ((buf = v4l2_m2m_dst_buf_remove(pair_m2m->fh.m2m_ctx)))
+ v4l2_m2m_buf_done(buf, VB2_BUF_STATE_QUEUED);
+ }
+ return ret;
+}
+
+static void asrc_m2m_stop_streaming(struct vb2_queue *q)
+{
+ struct asrc_pair_m2m *pair_m2m = vb2_get_drv_priv(q);
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+ struct fsl_asrc *asrc = pair->asrc;
+ struct device *dev = &m2m->pdev->dev;
+
+ dev_dbg(dev, "Stop streaming pair=%p, %d\n", pair, q->type);
+
+ v4l2_m2m_update_stop_streaming_state(pair_m2m->fh.m2m_ctx, q);
+
+ /* Stop & release pair/context */
+ if (asrc->m2m_stop)
+ asrc->m2m_stop(pair);
+
+ if (pair->req_pair) {
+ if (asrc->m2m_unprepare)
+ asrc->m2m_unprepare(pair);
+ asrc->release_pair(pair);
+ pair->req_pair = false;
+ }
+
+ /* Release dma channel */
+ if (V4L2_TYPE_IS_OUTPUT(q->type)) {
+ if (pair->dma_chan[V4L_OUT])
+ dma_release_channel(pair->dma_chan[V4L_OUT]);
+ } else {
+ if (pair->dma_chan[V4L_CAP])
+ dma_release_channel(pair->dma_chan[V4L_CAP]);
+ }
+
+ pm_runtime_put_sync(dev);
+}
+
+static int asrc_m2m_queue_setup(struct vb2_queue *q,
+ unsigned int *num_buffers, unsigned int *num_planes,
+ unsigned int sizes[], struct device *alloc_devs[])
+{
+ struct asrc_pair_m2m *pair_m2m = vb2_get_drv_priv(q);
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+ u32 size;
+
+ /*
+ * The capture buffer size depends on output buffer size
+ * and the convert ratio.
+ *
+ * Here we just use a fixed length for the capture and output
+ * buffers. The user needs to take care of it.
+ */
+ if (V4L2_TYPE_IS_OUTPUT(q->type))
+ size = pair->buf_len[V4L_OUT];
+ else
+ size = pair->buf_len[V4L_CAP];
+
+ if (*num_planes)
+ return sizes[0] < size ? -EINVAL : 0;
+
+ *num_planes = 1;
+ sizes[0] = size;
+
+ return 0;
+}
+
+static void asrc_m2m_buf_queue(struct vb2_buffer *vb)
+{
+ struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
+ struct asrc_pair_m2m *pair_m2m = vb2_get_drv_priv(vb->vb2_queue);
+
+ /* queue buffer */
+ v4l2_m2m_buf_queue(pair_m2m->fh.m2m_ctx, vbuf);
+}
+
+static const struct vb2_ops asrc_m2m_qops = {
+ .wait_prepare = vb2_ops_wait_prepare,
+ .wait_finish = vb2_ops_wait_finish,
+ .start_streaming = asrc_m2m_start_streaming,
+ .stop_streaming = asrc_m2m_stop_streaming,
+ .queue_setup = asrc_m2m_queue_setup,
+ .buf_queue = asrc_m2m_buf_queue,
+};
+
+/* Init video buffer queue for src and dst. */
+static int asrc_m2m_queue_init(void *priv, struct vb2_queue *src_vq,
+ struct vb2_queue *dst_vq)
+{
+ struct asrc_pair_m2m *pair_m2m = priv;
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+ int ret;
+
+ src_vq->type = V4L2_BUF_TYPE_AUDIO_OUTPUT;
+ src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
+ src_vq->drv_priv = pair_m2m;
+ src_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
+ src_vq->ops = &asrc_m2m_qops;
+ src_vq->mem_ops = &vb2_dma_contig_memops;
+ src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
+ src_vq->lock = &m2m->mlock;
+ src_vq->dev = &m2m->pdev->dev;
+
+ ret = vb2_queue_init(src_vq);
+ if (ret)
+ return ret;
+
+ dst_vq->type = V4L2_BUF_TYPE_AUDIO_CAPTURE;
+ dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
+ dst_vq->drv_priv = pair_m2m;
+ dst_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer);
+ dst_vq->ops = &asrc_m2m_qops;
+ dst_vq->mem_ops = &vb2_dma_contig_memops;
+ dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
+ dst_vq->lock = &m2m->mlock;
+ dst_vq->dev = &m2m->pdev->dev;
+
+ ret = vb2_queue_init(dst_vq);
+ return ret;
+}
+
+static int asrc_m2m_op_s_ctrl(struct v4l2_ctrl *ctrl)
+{
+ struct asrc_pair_m2m *pair_m2m =
+ container_of(ctrl->handler, struct asrc_pair_m2m, ctrl_handler);
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+ int ret = 0;
+
+ switch (ctrl->id) {
+ case V4L2_CID_M2M_AUDIO_SOURCE_RATE:
+ pair->rate[V4L_OUT] = ctrl->qmenu_int[ctrl->val];
+ break;
+ case V4L2_CID_M2M_AUDIO_DEST_RATE:
+ pair->rate[V4L_CAP] = ctrl->qmenu_int[ctrl->val];
+ break;
+ case V4L2_CID_M2M_AUDIO_SOURCE_RATE_OFFSET:
+ pair_m2m->src_rate_off_cur = *ctrl->p_new.p_s64;
+ break;
+ case V4L2_CID_M2M_AUDIO_DEST_RATE_OFFSET:
+ pair_m2m->dst_rate_off_cur = *ctrl->p_new.p_s64;
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
+
+static const struct v4l2_ctrl_ops asrc_m2m_ctrl_ops = {
+ .s_ctrl = asrc_m2m_op_s_ctrl,
+};
+
+static const struct v4l2_ctrl_config asrc_src_rate_off_control = {
+ .ops = &asrc_m2m_ctrl_ops,
+ .id = V4L2_CID_M2M_AUDIO_SOURCE_RATE_OFFSET,
+ .name = "Audio Source Sample Rate Offset",
+ .type = V4L2_CTRL_TYPE_INTEGER64,
+ .min = v4l2_ctrl_fp_compose(-128, 0, 32),
+ .max = v4l2_ctrl_fp_compose(127, 0xffffffff, 32),
+ .def = 0,
+ .step = 1,
+ .fraction_bits = 32,
+};
+
+static const struct v4l2_ctrl_config asrc_dst_rate_off_control = {
+ .ops = &asrc_m2m_ctrl_ops,
+ .id = V4L2_CID_M2M_AUDIO_DEST_RATE_OFFSET,
+ .name = "Audio Dest Sample Rate Offset",
+ .type = V4L2_CTRL_TYPE_INTEGER64,
+ .min = v4l2_ctrl_fp_compose(-128, 0, 32),
+ .max = v4l2_ctrl_fp_compose(127, 0xffffffff, 32),
+ .def = 0,
+ .step = 1,
+ .fraction_bits = 32,
+};
+
+/* system callback for open() */
+static int asrc_m2m_open(struct file *file)
+{
+ struct asrc_m2m *m2m = video_drvdata(file);
+ struct fsl_asrc *asrc = m2m->pdata.asrc;
+ struct video_device *vdev = video_devdata(file);
+ struct fsl_asrc_pair *pair;
+ struct asrc_pair_m2m *pair_m2m;
+ int ret = 0;
+
+ if (mutex_lock_interruptible(&m2m->mlock))
+ return -ERESTARTSYS;
+
+ pair = kzalloc(sizeof(*pair) + asrc->pair_priv_size, GFP_KERNEL);
+ if (!pair) {
+ ret = -ENOMEM;
+ goto err_alloc_pair;
+ }
+
+ pair_m2m = kzalloc(sizeof(*pair_m2m), GFP_KERNEL);
+ if (!pair_m2m) {
+ ret = -ENOMEM;
+ goto err_alloc_pair_m2m;
+ }
+
+ pair->private = (void *)pair + sizeof(struct fsl_asrc_pair);
+ pair->asrc = asrc;
+
+ pair->buf_len[V4L_OUT] = ASRC_M2M_BUFFER_SIZE;
+ pair->buf_len[V4L_CAP] = ASRC_M2M_BUFFER_SIZE;
+
+ pair->channels = 2;
+ pair->rate[V4L_OUT] = 8000;
+ pair->rate[V4L_CAP] = 8000;
+ pair->sample_format[V4L_OUT] = SNDRV_PCM_FORMAT_S16_LE;
+ pair->sample_format[V4L_CAP] = SNDRV_PCM_FORMAT_S16_LE;
+
+ init_completion(&pair->complete[V4L_OUT]);
+ init_completion(&pair->complete[V4L_CAP]);
+
+ v4l2_fh_init(&pair_m2m->fh, vdev);
+ v4l2_fh_add(&pair_m2m->fh);
+ file->private_data = &pair_m2m->fh;
+
+ pair_m2m->pair = pair;
+ pair_m2m->m2m = m2m;
+ /* m2m context init */
+ pair_m2m->fh.m2m_ctx = v4l2_m2m_ctx_init(m2m->m2m_dev, pair_m2m,
+ asrc_m2m_queue_init);
+ if (IS_ERR(pair_m2m->fh.m2m_ctx)) {
+ ret = PTR_ERR(pair_m2m->fh.m2m_ctx);
+ goto err_ctx_init;
+ }
+
+ v4l2_ctrl_handler_init(&pair_m2m->ctrl_handler, 4);
+
+ if (m2m->pdata.rate_min == 5512) {
+ v4l2_ctrl_new_int_menu(&pair_m2m->ctrl_handler, &asrc_m2m_ctrl_ops,
+ V4L2_CID_M2M_AUDIO_SOURCE_RATE,
+ ARRAY_SIZE(asrc_v1_m2m_rates) - 1, 1, asrc_v1_m2m_rates);
+ v4l2_ctrl_new_int_menu(&pair_m2m->ctrl_handler, &asrc_m2m_ctrl_ops,
+ V4L2_CID_M2M_AUDIO_DEST_RATE,
+ ARRAY_SIZE(asrc_v1_m2m_rates) - 1, 1, asrc_v1_m2m_rates);
+ } else {
+ v4l2_ctrl_new_int_menu(&pair_m2m->ctrl_handler, &asrc_m2m_ctrl_ops,
+ V4L2_CID_M2M_AUDIO_SOURCE_RATE,
+ ARRAY_SIZE(asrc_v2_m2m_rates) - 1, 0, asrc_v2_m2m_rates);
+ v4l2_ctrl_new_int_menu(&pair_m2m->ctrl_handler, &asrc_m2m_ctrl_ops,
+ V4L2_CID_M2M_AUDIO_DEST_RATE,
+ ARRAY_SIZE(asrc_v2_m2m_rates) - 1, 0, asrc_v2_m2m_rates);
+ }
+
+ v4l2_ctrl_new_custom(&pair_m2m->ctrl_handler, &asrc_src_rate_off_control, NULL);
+ v4l2_ctrl_new_custom(&pair_m2m->ctrl_handler, &asrc_dst_rate_off_control, NULL);
+
+ if (pair_m2m->ctrl_handler.error) {
+ ret = pair_m2m->ctrl_handler.error;
+ v4l2_ctrl_handler_free(&pair_m2m->ctrl_handler);
+ goto err_ctrl_handler;
+ }
+
+ pair_m2m->fh.ctrl_handler = &pair_m2m->ctrl_handler;
+
+ mutex_unlock(&m2m->mlock);
+
+ return 0;
+
+err_ctrl_handler:
+ v4l2_m2m_ctx_release(pair_m2m->fh.m2m_ctx);
+err_ctx_init:
+ v4l2_fh_del(&pair_m2m->fh);
+ v4l2_fh_exit(&pair_m2m->fh);
+ kfree(pair_m2m);
+err_alloc_pair_m2m:
+ kfree(pair);
+err_alloc_pair:
+ mutex_unlock(&m2m->mlock);
+ return ret;
+}
+
+static int asrc_m2m_release(struct file *file)
+{
+ struct asrc_m2m *m2m = video_drvdata(file);
+ struct asrc_pair_m2m *pair_m2m = asrc_m2m_fh_to_ctx(file->private_data);
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+
+ mutex_lock(&m2m->mlock);
+ v4l2_ctrl_handler_free(&pair_m2m->ctrl_handler);
+ v4l2_m2m_ctx_release(pair_m2m->fh.m2m_ctx);
+ v4l2_fh_del(&pair_m2m->fh);
+ v4l2_fh_exit(&pair_m2m->fh);
+ kfree(pair_m2m);
+ kfree(pair);
+ mutex_unlock(&m2m->mlock);
+
+ return 0;
+}
+
+static const struct v4l2_file_operations asrc_m2m_fops = {
+ .owner = THIS_MODULE,
+ .open = asrc_m2m_open,
+ .release = asrc_m2m_release,
+ .poll = v4l2_m2m_fop_poll,
+ .unlocked_ioctl = video_ioctl2,
+ .mmap = v4l2_m2m_fop_mmap,
+};
+
+static int asrc_m2m_querycap(struct file *file, void *priv,
+ struct v4l2_capability *cap)
+{
+ strscpy(cap->driver, M2M_DRV_NAME, sizeof(cap->driver));
+ strscpy(cap->card, M2M_DRV_NAME, sizeof(cap->card));
+ cap->device_caps = V4L2_CAP_STREAMING | V4L2_CAP_AUDIO_M2M;
+
+ return 0;
+}
+
+static int enum_fmt(struct v4l2_fmtdesc *f, u64 fmtbit)
+{
+ snd_pcm_format_t fmt;
+ int i, num;
+
+ num = 0;
+
+ for (i = 0; i < NUM_FORMATS; ++i) {
+ fmt = v4l2_fourcc_to_audfmt(formats[i]);
+ if (pcm_format_to_bits(fmt) & fmtbit) {
+ if (num == f->index)
+ break;
+ /*
+ * Correct type but haven't reached our index yet,
+ * just increment per-type index
+ */
+ ++num;
+ }
+ }
+
+ if (i < NUM_FORMATS) {
+ /* Format found */
+ f->pixelformat = formats[i];
+ return 0;
+ }
+
+ return -EINVAL;
+}
+
+static int asrc_m2m_enum_fmt_aud_cap(struct file *file, void *fh,
+ struct v4l2_fmtdesc *f)
+{
+ struct asrc_pair_m2m *pair_m2m = asrc_m2m_fh_to_ctx(fh);
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+
+ return enum_fmt(f, m2m->pdata.fmt_out);
+}
+
+static int asrc_m2m_enum_fmt_aud_out(struct file *file, void *fh,
+ struct v4l2_fmtdesc *f)
+{
+ struct asrc_pair_m2m *pair_m2m = asrc_m2m_fh_to_ctx(fh);
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+
+ return enum_fmt(f, m2m->pdata.fmt_in);
+}
+
+static int asrc_m2m_g_fmt_aud_cap(struct file *file, void *fh,
+ struct v4l2_format *f)
+{
+ struct asrc_pair_m2m *pair_m2m = asrc_m2m_fh_to_ctx(fh);
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+
+ f->fmt.audio.channels = pair->channels;
+ f->fmt.audio.buffersize = pair->buf_len[V4L_CAP];
+ f->fmt.audio.audioformat = find_fourcc(pair->sample_format[V4L_CAP]);
+
+ return 0;
+}
+
+static int asrc_m2m_g_fmt_aud_out(struct file *file, void *fh,
+ struct v4l2_format *f)
+{
+ struct asrc_pair_m2m *pair_m2m = asrc_m2m_fh_to_ctx(fh);
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+
+ f->fmt.audio.channels = pair->channels;
+ f->fmt.audio.buffersize = pair->buf_len[V4L_OUT];
+ f->fmt.audio.audioformat = find_fourcc(pair->sample_format[V4L_OUT]);
+
+ return 0;
+}
+
+/* output for asrc */
+static int asrc_m2m_s_fmt_aud_cap(struct file *file, void *fh,
+ struct v4l2_format *f)
+{
+ struct asrc_pair_m2m *pair_m2m = asrc_m2m_fh_to_ctx(fh);
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+ struct device *dev = &m2m->pdev->dev;
+
+ f->fmt.audio.audioformat = asrc_check_format(pair_m2m, OUT, f->fmt.audio.audioformat);
+ f->fmt.audio.channels = asrc_check_channel(pair_m2m, OUT, f->fmt.audio.channels);
+
+ if (pair_m2m->channels[V4L_CAP] > 0 &&
+ pair_m2m->channels[V4L_CAP] != f->fmt.audio.channels) {
+ dev_err(dev, "channels don't match for cap and out\n");
+ return -EINVAL;
+ }
+
+ pair_m2m->channels[V4L_CAP] = f->fmt.audio.channels;
+ pair->channels = f->fmt.audio.channels;
+ pair->sample_format[V4L_CAP] = find_format(f->fmt.audio.audioformat);
+
+ return 0;
+}
+
+/* input for asrc */
+static int asrc_m2m_s_fmt_aud_out(struct file *file, void *fh,
+ struct v4l2_format *f)
+{
+ struct asrc_pair_m2m *pair_m2m = asrc_m2m_fh_to_ctx(fh);
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+ struct device *dev = &m2m->pdev->dev;
+
+ f->fmt.audio.audioformat = asrc_check_format(pair_m2m, IN, f->fmt.audio.audioformat);
+ f->fmt.audio.channels = asrc_check_channel(pair_m2m, IN, f->fmt.audio.channels);
+ if (pair_m2m->channels[V4L_OUT] > 0 &&
+ pair_m2m->channels[V4L_OUT] != f->fmt.audio.channels) {
+ dev_err(dev, "channels don't match for cap and out\n");
+ return -EINVAL;
+ }
+
+ pair_m2m->channels[V4L_OUT] = f->fmt.audio.channels;
+ pair->channels = f->fmt.audio.channels;
+ pair->sample_format[V4L_OUT] = find_format(f->fmt.audio.audioformat);
+
+ return 0;
+}
+
+static int asrc_m2m_try_fmt_audio_cap(struct file *file, void *fh,
+ struct v4l2_format *f)
+{
+ struct asrc_pair_m2m *pair_m2m = asrc_m2m_fh_to_ctx(fh);
+
+ f->fmt.audio.audioformat = asrc_check_format(pair_m2m, OUT, f->fmt.audio.audioformat);
+ f->fmt.audio.channels = asrc_check_channel(pair_m2m, OUT, f->fmt.audio.channels);
+
+ return 0;
+}
+
+static int asrc_m2m_try_fmt_audio_out(struct file *file, void *fh,
+ struct v4l2_format *f)
+{
+ struct asrc_pair_m2m *pair_m2m = asrc_m2m_fh_to_ctx(fh);
+
+ f->fmt.audio.audioformat = asrc_check_format(pair_m2m, IN, f->fmt.audio.audioformat);
+ f->fmt.audio.channels = asrc_check_channel(pair_m2m, IN, f->fmt.audio.channels);
+
+ return 0;
+}
+
+static const struct v4l2_ioctl_ops asrc_m2m_ioctl_ops = {
+ .vidioc_querycap = asrc_m2m_querycap,
+
+ .vidioc_enum_fmt_audio_cap = asrc_m2m_enum_fmt_aud_cap,
+ .vidioc_enum_fmt_audio_out = asrc_m2m_enum_fmt_aud_out,
+
+ .vidioc_g_fmt_audio_cap = asrc_m2m_g_fmt_aud_cap,
+ .vidioc_g_fmt_audio_out = asrc_m2m_g_fmt_aud_out,
+
+ .vidioc_s_fmt_audio_cap = asrc_m2m_s_fmt_aud_cap,
+ .vidioc_s_fmt_audio_out = asrc_m2m_s_fmt_aud_out,
+
+ .vidioc_try_fmt_audio_cap = asrc_m2m_try_fmt_audio_cap,
+ .vidioc_try_fmt_audio_out = asrc_m2m_try_fmt_audio_out,
+
+ .vidioc_qbuf = v4l2_m2m_ioctl_qbuf,
+ .vidioc_dqbuf = v4l2_m2m_ioctl_dqbuf,
+
+ .vidioc_create_bufs = v4l2_m2m_ioctl_create_bufs,
+ .vidioc_prepare_buf = v4l2_m2m_ioctl_prepare_buf,
+ .vidioc_reqbufs = v4l2_m2m_ioctl_reqbufs,
+ .vidioc_querybuf = v4l2_m2m_ioctl_querybuf,
+ .vidioc_streamon = v4l2_m2m_ioctl_streamon,
+ .vidioc_streamoff = v4l2_m2m_ioctl_streamoff,
+ .vidioc_subscribe_event = v4l2_ctrl_subscribe_event,
+ .vidioc_unsubscribe_event = v4l2_event_unsubscribe,
+};
+
+/* dma complete callback */
+static void asrc_input_dma_callback(void *data)
+{
+ struct fsl_asrc_pair *pair = (struct fsl_asrc_pair *)data;
+
+ complete(&pair->complete[V4L_OUT]);
+}
+
+/* dma complete callback */
+static void asrc_output_dma_callback(void *data)
+{
+ struct fsl_asrc_pair *pair = (struct fsl_asrc_pair *)data;
+
+ complete(&pair->complete[V4L_CAP]);
+}
+
+/* config dma channel */
+static int asrc_dmaconfig(struct asrc_pair_m2m *pair_m2m,
+ struct dma_chan *chan,
+ u32 dma_addr, dma_addr_t buf_addr, u32 buf_len,
+ int dir, int width)
+{
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+ struct fsl_asrc *asrc = pair->asrc;
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+ struct device *dev = &m2m->pdev->dev;
+ struct dma_slave_config slave_config;
+ enum dma_slave_buswidth buswidth;
+ unsigned int sg_len, max_period_size;
+ struct scatterlist *sg;
+ int ret, i;
+
+ switch (width) {
+ case 8:
+ buswidth = DMA_SLAVE_BUSWIDTH_1_BYTE;
+ break;
+ case 16:
+ buswidth = DMA_SLAVE_BUSWIDTH_2_BYTES;
+ break;
+ case 24:
+ buswidth = DMA_SLAVE_BUSWIDTH_3_BYTES;
+ break;
+ case 32:
+ buswidth = DMA_SLAVE_BUSWIDTH_4_BYTES;
+ break;
+ default:
+ dev_err(dev, "invalid word width\n");
+ return -EINVAL;
+ }
+
+ memset(&slave_config, 0, sizeof(slave_config));
+ if (dir == V4L_OUT) {
+ slave_config.direction = DMA_MEM_TO_DEV;
+ slave_config.dst_addr = dma_addr;
+ slave_config.dst_addr_width = buswidth;
+ slave_config.dst_maxburst = asrc->m2m_get_maxburst(IN, pair);
+ } else {
+ slave_config.direction = DMA_DEV_TO_MEM;
+ slave_config.src_addr = dma_addr;
+ slave_config.src_addr_width = buswidth;
+ slave_config.src_maxburst = asrc->m2m_get_maxburst(OUT, pair);
+ }
+
+ ret = dmaengine_slave_config(chan, &slave_config);
+ if (ret) {
+ dev_err(dev, "failed to config dmaengine for %s task: %d\n",
+ DIR_STR(dir), ret);
+ return ret;
+ }
+
+ max_period_size = rounddown(ASRC_M2M_PERIOD_SIZE, width * pair->channels / 8);
+ /* scatter gather mode */
+ sg_len = buf_len / max_period_size;
+ if (buf_len % max_period_size)
+ sg_len += 1;
+
+ sg = kmalloc_array(sg_len, sizeof(*sg), GFP_KERNEL);
+ if (!sg)
+ return -ENOMEM;
+
+ sg_init_table(sg, sg_len);
+ for (i = 0; i < (sg_len - 1); i++) {
+ sg_dma_address(&sg[i]) = buf_addr + i * max_period_size;
+ sg_dma_len(&sg[i]) = max_period_size;
+ }
+ sg_dma_address(&sg[i]) = buf_addr + i * max_period_size;
+ sg_dma_len(&sg[i]) = buf_len - i * max_period_size;
+
+ pair->desc[dir] = dmaengine_prep_slave_sg(chan, sg, sg_len,
+ slave_config.direction,
+ DMA_PREP_INTERRUPT);
+ kfree(sg);
+ if (!pair->desc[dir]) {
+ dev_err(dev, "failed to prepare dmaengine for %s task\n", DIR_STR(dir));
+ return -EINVAL;
+ }
+
+ pair->desc[dir]->callback = ASRC_xPUT_DMA_CALLBACK(dir);
+ pair->desc[dir]->callback_param = pair;
+
+ return 0;
+}
+
+static void asrc_m2m_set_ratio_mod(struct asrc_pair_m2m *pair_m2m)
+{
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+ struct fsl_asrc *asrc = pair->asrc;
+ s32 src_rate_int, dst_rate_int;
+ s64 src_rate_frac;
+ s64 dst_rate_frac;
+ u64 src_rate, dst_rate;
+ u64 ratio_pre, ratio_cur;
+ s64 ratio_diff;
+
+ if (!asrc->m2m_set_ratio_mod)
+ return;
+
+ if (pair_m2m->src_rate_off_cur == pair_m2m->src_rate_off_prev &&
+ pair_m2m->dst_rate_off_cur == pair_m2m->dst_rate_off_prev)
+ return;
+
+ /*
+ * The maximum supported rate is 768kHz, so the divisor can safely be
+ * shifted right by 21 bits before the division below.
+ */
+ src_rate_int = pair->rate[V4L_OUT];
+ src_rate_frac = pair_m2m->src_rate_off_prev;
+
+ src_rate = ((s64)src_rate_int << 32) + src_rate_frac;
+
+ dst_rate_int = pair->rate[V4L_CAP];
+ dst_rate_frac = pair_m2m->dst_rate_off_prev;
+
+ dst_rate = ((s64)dst_rate_int << 32) + dst_rate_frac;
+ dst_rate >>= 21;
+ do_div(src_rate, dst_rate);
+ ratio_pre = src_rate;
+
+ src_rate_frac = pair_m2m->src_rate_off_cur;
+ src_rate = ((s64)src_rate_int << 32) + src_rate_frac;
+
+ dst_rate_frac = pair_m2m->dst_rate_off_cur;
+ dst_rate = ((s64)dst_rate_int << 32) + dst_rate_frac;
+ dst_rate >>= 21;
+ do_div(src_rate, dst_rate);
+ ratio_cur = src_rate;
+
+ ratio_diff = ratio_cur - ratio_pre;
+ asrc->m2m_set_ratio_mod(pair, ratio_diff << 10);
+
+ pair_m2m->src_rate_off_prev = pair_m2m->src_rate_off_cur;
+ pair_m2m->dst_rate_off_prev = pair_m2m->dst_rate_off_cur;
+}
+
+/* main function of converter */
+static void asrc_m2m_device_run(void *priv)
+{
+ struct asrc_pair_m2m *pair_m2m = priv;
+ struct fsl_asrc_pair *pair = pair_m2m->pair;
+ struct asrc_m2m *m2m = pair_m2m->m2m;
+ struct fsl_asrc *asrc = pair->asrc;
+ struct device *dev = &m2m->pdev->dev;
+ enum asrc_pair_index index = pair->index;
+ struct vb2_v4l2_buffer *src_buf, *dst_buf;
+ unsigned int out_buf_len;
+ unsigned int cap_dma_len;
+ unsigned int width;
+ u32 fifo_addr;
+ int ret;
+
+ /* set ratio mod */
+ asrc_m2m_set_ratio_mod(pair_m2m);
+
+ src_buf = v4l2_m2m_next_src_buf(pair_m2m->fh.m2m_ctx);
+ dst_buf = v4l2_m2m_next_dst_buf(pair_m2m->fh.m2m_ctx);
+
+ src_buf->sequence = pair_m2m->sequence[V4L_OUT]++;
+ dst_buf->sequence = pair_m2m->sequence[V4L_CAP]++;
+
+ width = snd_pcm_format_physical_width(pair->sample_format[V4L_OUT]);
+ fifo_addr = asrc->paddr + asrc->get_fifo_addr(IN, index);
+ out_buf_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
+ if (out_buf_len < width * pair->channels / 8 ||
+ out_buf_len > ASRC_M2M_BUFFER_SIZE ||
+ out_buf_len % (width * pair->channels / 8)) {
+ dev_err(dev, "invalid out buffer size: [%u]\n", out_buf_len);
+ goto end;
+ }
+
+ /* dma config for output dma channel */
+ ret = asrc_dmaconfig(pair_m2m,
+ pair->dma_chan[V4L_OUT],
+ fifo_addr,
+ vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0),
+ out_buf_len, V4L_OUT, width);
+ if (ret) {
+ dev_err(dev, "out dma config error\n");
+ goto end;
+ }
+
+ width = snd_pcm_format_physical_width(pair->sample_format[V4L_CAP]);
+ fifo_addr = asrc->paddr + asrc->get_fifo_addr(OUT, index);
+ cap_dma_len = asrc->m2m_calc_out_len(pair, out_buf_len);
+ if (cap_dma_len > 0 && cap_dma_len <= ASRC_M2M_BUFFER_SIZE) {
+ /* dma config for capture dma channel */
+ ret = asrc_dmaconfig(pair_m2m,
+ pair->dma_chan[V4L_CAP],
+ fifo_addr,
+ vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0),
+ cap_dma_len, V4L_CAP, width);
+ if (ret) {
+ dev_err(dev, "cap dma config error\n");
+ goto end;
+ }
+ } else if (cap_dma_len > ASRC_M2M_BUFFER_SIZE) {
+ dev_err(dev, "invalid cap buffer size\n");
+ goto end;
+ }
+
+ reinit_completion(&pair->complete[V4L_OUT]);
+ reinit_completion(&pair->complete[V4L_CAP]);
+
+ /* Submit DMA request */
+ dmaengine_submit(pair->desc[V4L_OUT]);
+ dma_async_issue_pending(pair->desc[V4L_OUT]->chan);
+ if (cap_dma_len > 0) {
+ dmaengine_submit(pair->desc[V4L_CAP]);
+ dma_async_issue_pending(pair->desc[V4L_CAP]->chan);
+ }
+
+ asrc->m2m_start(pair);
+
+ if (!wait_for_completion_interruptible_timeout(&pair->complete[V4L_OUT], 10 * HZ)) {
+ dev_err(dev, "out DMA task timeout\n");
+ goto end;
+ }
+
+ if (cap_dma_len > 0) {
+ if (!wait_for_completion_interruptible_timeout(&pair->complete[V4L_CAP], 10 * HZ)) {
+ dev_err(dev, "cap DMA task timeout\n");
+ goto end;
+ }
+ }
+
+ /* read the last words from FIFO */
+ asrc_read_last_fifo(pair, vb2_plane_vaddr(&dst_buf->vb2_buf, 0), &cap_dma_len);
+ /* update payload length for capture */
+ vb2_set_plane_payload(&dst_buf->vb2_buf, 0, cap_dma_len);
+
+end:
+ src_buf = v4l2_m2m_src_buf_remove(pair_m2m->fh.m2m_ctx);
+ dst_buf = v4l2_m2m_dst_buf_remove(pair_m2m->fh.m2m_ctx);
+
+ v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
+ v4l2_m2m_buf_done(dst_buf, VB2_BUF_STATE_DONE);
+
+ v4l2_m2m_job_finish(m2m->m2m_dev, pair_m2m->fh.m2m_ctx);
+}
+
+static int asrc_m2m_job_ready(void *priv)
+{
+ struct asrc_pair_m2m *pair_m2m = priv;
+
+ if (v4l2_m2m_num_src_bufs_ready(pair_m2m->fh.m2m_ctx) > 0 &&
+ v4l2_m2m_num_dst_bufs_ready(pair_m2m->fh.m2m_ctx) > 0) {
+ return 1;
+ }
+
+ return 0;
+}
+
+static const struct v4l2_m2m_ops asrc_m2m_ops = {
+ .job_ready = asrc_m2m_job_ready,
+ .device_run = asrc_m2m_device_run,
+};
+
+static const struct media_device_ops asrc_m2m_media_ops = {
+ .req_validate = vb2_request_validate,
+ .req_queue = v4l2_m2m_request_queue,
+};
+
+static int asrc_m2m_probe(struct platform_device *pdev)
+{
+ struct fsl_asrc_m2m_pdata *data = pdev->dev.platform_data;
+ struct device *dev = &pdev->dev;
+ struct asrc_m2m *m2m;
+ int ret;
+
+ m2m = devm_kzalloc(dev, sizeof(*m2m), GFP_KERNEL);
+ if (!m2m)
+ return -ENOMEM;
+
+ m2m->pdata = *data;
+ m2m->pdev = pdev;
+
+ ret = v4l2_device_register(dev, &m2m->v4l2_dev);
+ if (ret) {
+ dev_err(dev, "failed to register v4l2 device\n");
+ goto err_register;
+ }
+
+ m2m->m2m_dev = v4l2_m2m_init(&asrc_m2m_ops);
+ if (IS_ERR(m2m->m2m_dev)) {
+ ret = PTR_ERR(m2m->m2m_dev);
+ dev_err_probe(dev, ret, "failed to initialize v4l2-m2m device\n");
+ goto err_m2m;
+ }
+
+ m2m->dec_vdev = video_device_alloc();
+ if (!m2m->dec_vdev) {
+ ret = -ENOMEM;
+ goto err_vdev_alloc;
+ }
+
+ mutex_init(&m2m->mlock);
+
+ m2m->dec_vdev->fops = &asrc_m2m_fops;
+ m2m->dec_vdev->ioctl_ops = &asrc_m2m_ioctl_ops;
+ m2m->dec_vdev->minor = -1;
+ m2m->dec_vdev->release = video_device_release;
+ m2m->dec_vdev->lock = &m2m->mlock; /* lock for ioctl serialization */
+ m2m->dec_vdev->v4l2_dev = &m2m->v4l2_dev;
+ m2m->dec_vdev->vfl_dir = VFL_DIR_M2M;
+ m2m->dec_vdev->device_caps = V4L2_CAP_STREAMING | V4L2_CAP_AUDIO_M2M;
+
+#ifdef CONFIG_MEDIA_CONTROLLER
+ m2m->mdev.dev = &pdev->dev;
+ strscpy(m2m->mdev.model, M2M_DRV_NAME, sizeof(m2m->mdev.model));
+ media_device_init(&m2m->mdev);
+ m2m->mdev.ops = &asrc_m2m_media_ops;
+ m2m->v4l2_dev.mdev = &m2m->mdev;
+#endif
+
+ ret = video_register_device(m2m->dec_vdev, VFL_TYPE_AUDIO, -1);
+ if (ret) {
+ dev_err_probe(dev, ret, "failed to register video device\n");
+ goto err_vdev_register;
+ }
+
+#ifdef CONFIG_MEDIA_CONTROLLER
+ ret = v4l2_m2m_register_media_controller(m2m->m2m_dev, m2m->dec_vdev,
+ MEDIA_ENT_F_PROC_AUDIO_RESAMPLER);
+ if (ret) {
+ dev_err_probe(dev, ret, "Failed to init mem2mem media controller\n");
+ goto error_v4l2;
+ }
+
+ ret = media_device_register(&m2m->mdev);
+ if (ret) {
+ dev_err_probe(dev, ret, "Failed to register mem2mem media device\n");
+ goto error_m2m_mc;
+ }
+#endif
+
+ video_set_drvdata(m2m->dec_vdev, m2m);
+ platform_set_drvdata(pdev, m2m);
+ pm_runtime_enable(&pdev->dev);
+
+ return 0;
+
+#ifdef CONFIG_MEDIA_CONTROLLER
+error_m2m_mc:
+ v4l2_m2m_unregister_media_controller(m2m->m2m_dev);
+#endif
+error_v4l2:
+ video_unregister_device(m2m->dec_vdev);
+err_vdev_register:
+ video_device_release(m2m->dec_vdev);
+err_vdev_alloc:
+ v4l2_m2m_release(m2m->m2m_dev);
+err_m2m:
+ v4l2_device_unregister(&m2m->v4l2_dev);
+err_register:
+ return ret;
+}
+
+static void asrc_m2m_remove(struct platform_device *pdev)
+{
+ struct asrc_m2m *m2m = platform_get_drvdata(pdev);
+
+ pm_runtime_disable(&pdev->dev);
+#ifdef CONFIG_MEDIA_CONTROLLER
+ media_device_unregister(&m2m->mdev);
+ v4l2_m2m_unregister_media_controller(m2m->m2m_dev);
+#endif
+ video_unregister_device(m2m->dec_vdev);
+ video_device_release(m2m->dec_vdev);
+ v4l2_m2m_release(m2m->m2m_dev);
+ v4l2_device_unregister(&m2m->v4l2_dev);
+}
+
+#ifdef CONFIG_PM_SLEEP
+/* suspend callback for m2m */
+static int asrc_m2m_suspend(struct device *dev)
+{
+ struct asrc_m2m *m2m = dev_get_drvdata(dev);
+ struct fsl_asrc *asrc = m2m->pdata.asrc;
+ struct fsl_asrc_pair *pair;
+ unsigned long lock_flags;
+ int i;
+
+ for (i = 0; i < PAIR_CTX_NUM; i++) {
+ spin_lock_irqsave(&asrc->lock, lock_flags);
+ pair = asrc->pair[i];
+ if (!pair || !pair->req_pair) {
+ spin_unlock_irqrestore(&asrc->lock, lock_flags);
+ continue;
+ }
+ if (!completion_done(&pair->complete[V4L_OUT])) {
+ if (pair->dma_chan[V4L_OUT])
+ dmaengine_terminate_all(pair->dma_chan[V4L_OUT]);
+ asrc_input_dma_callback((void *)pair);
+ }
+ if (!completion_done(&pair->complete[V4L_CAP])) {
+ if (pair->dma_chan[V4L_CAP])
+ dmaengine_terminate_all(pair->dma_chan[V4L_CAP]);
+ asrc_output_dma_callback((void *)pair);
+ }
+
+ if (asrc->m2m_pair_suspend)
+ asrc->m2m_pair_suspend(pair);
+
+ spin_unlock_irqrestore(&asrc->lock, lock_flags);
+ }
+
+ return 0;
+}
+
+static int asrc_m2m_resume(struct device *dev)
+{
+ struct asrc_m2m *m2m = dev_get_drvdata(dev);
+ struct fsl_asrc *asrc = m2m->pdata.asrc;
+ struct fsl_asrc_pair *pair;
+ unsigned long lock_flags;
+ int i;
+
+ for (i = 0; i < PAIR_CTX_NUM; i++) {
+ spin_lock_irqsave(&asrc->lock, lock_flags);
+ pair = asrc->pair[i];
+ if (!pair || !pair->req_pair) {
+ spin_unlock_irqrestore(&asrc->lock, lock_flags);
+ continue;
+ }
+ if (asrc->m2m_pair_resume)
+ asrc->m2m_pair_resume(pair);
+
+ spin_unlock_irqrestore(&asrc->lock, lock_flags);
+ }
+
+ return 0;
+}
+#endif
+
+static const struct dev_pm_ops asrc_m2m_pm_ops = {
+ SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(asrc_m2m_suspend,
+ asrc_m2m_resume)
+};
+
+static const struct platform_device_id asrc_m2m_driver_ids[] __always_unused = {
+ { .name = M2M_DRV_NAME },
+ { },
+};
+MODULE_DEVICE_TABLE(platform, asrc_m2m_driver_ids);
+
+static struct platform_driver asrc_m2m_driver = {
+ .probe = asrc_m2m_probe,
+ .remove_new = asrc_m2m_remove,
+ .id_table = asrc_m2m_driver_ids,
+ .driver = {
+ .name = M2M_DRV_NAME,
+ .pm = &asrc_m2m_pm_ops,
+ },
+};
+module_platform_driver(asrc_m2m_driver);
+
+MODULE_DESCRIPTION("Freescale ASRC M2M driver");
+MODULE_LICENSE("GPL");
--
2.34.1
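As an aside for readers following the DMA setup above: the period-splitting arithmetic in asrc_dmaconfig() (round the period down to a whole number of audio frames, then count full periods plus a trailing remainder entry) can be sketched in plain user-space C. ASRC_M2M_PERIOD_SIZE is reused here as a stand-alone constant and both helpers are illustrative, not part of the driver:

```c
#include <assert.h>

/* Stand-in for the driver's period-size constant. */
#define ASRC_M2M_PERIOD_SIZE 4096

/* Largest period that still holds a whole number of audio frames
 * (frame = sample width in bytes * channels), mirroring the
 * rounddown() call in asrc_dmaconfig(). */
static unsigned int max_period(unsigned int width_bits, unsigned int channels)
{
	unsigned int frame = width_bits * channels / 8;

	return ASRC_M2M_PERIOD_SIZE - (ASRC_M2M_PERIOD_SIZE % frame);
}

/* Number of scatterlist entries for buf_len bytes: one per full
 * period, plus one trailing entry for any remainder. */
static unsigned int sg_entries(unsigned int buf_len, unsigned int period)
{
	unsigned int n = buf_len / period;

	return buf_len % period ? n + 1 : n;
}
```

For example, a 10000-byte S24/stereo buffer uses a 4092-byte period (682 six-byte frames) and three entries: 4092 + 4092 + 1816.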

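Similarly, the fixed-point arithmetic in asrc_m2m_set_ratio_mod() above — rates carried as Q32 values (integer Hz in the upper 32 bits plus a 32-bit fractional offset) with the divisor pre-shifted by 21 bits so the 64-bit division yields a Q21 ratio — can be sketched as follows (illustrative helper, not part of the driver; rates are assumed below the 768kHz limit mentioned in the code):

```c
#include <stdint.h>

/* Build Q32 rates and form the src/dst ratio as a Q21 fixed point
 * value, mirroring asrc_m2m_set_ratio_mod(). Because rates stay
 * below 768kHz (< 2^20), shifting the divisor right by 21 bits keeps
 * the division representable in 64 bits. */
static uint64_t ratio_q21(int32_t src_int, int64_t src_frac,
			  int32_t dst_int, int64_t dst_frac)
{
	uint64_t src = ((int64_t)src_int << 32) + src_frac;
	uint64_t dst = ((int64_t)dst_int << 32) + dst_frac;

	dst >>= 21;		/* keep the quotient within 64 bits */
	return src / dst;	/* Q21: (src/dst) * 2^21 */
}
```

A 1:1 conversion (48kHz to 48kHz) yields exactly 1 << 21; the driver subtracts consecutive ratios and passes the difference, scaled by a further 10 bits, to m2m_set_ratio_mod().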

2024-03-19 08:17:26

by Shengjiu Wang

[permalink] [raw]
Subject: [PATCH v15 01/16] media: v4l2-ctrls: add support for fraction_bits

From: Hans Verkuil <[email protected]>

This adds support for the fraction_bits field, used with integer controls.
This allows fixed point formats to be described.

The fraction_bits field is only exposed through VIDIOC_QUERY_EXT_CTRL.

For a given signed two's complement Qf fixed point value, 'f' equals
fraction_bits: the represented value is the raw integer value divided
by 2^fraction_bits.

Signed-off-by: Hans Verkuil <[email protected]>
---
.../media/v4l/vidioc-queryctrl.rst | 11 ++-
drivers/media/v4l2-core/v4l2-ctrls-api.c | 1 +
drivers/media/v4l2-core/v4l2-ctrls-core.c | 93 +++++++++++++++----
include/media/v4l2-ctrls.h | 7 +-
include/uapi/linux/videodev2.h | 3 +-
5 files changed, 95 insertions(+), 20 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
index 4d38acafe8e1..e65c7e5d78ec 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
@@ -267,7 +267,16 @@ See also the examples in :ref:`control`.
- The size of each dimension. The first ``nr_of_dims`` elements of
this array must be non-zero, all remaining elements must be zero.
* - __u32
- - ``reserved``\ [32]
+ - ``fraction_bits``
+ - The number of least significant bits of the control value that
+ form the fraction of the fixed point value. This is 0 if the value
+ is a regular integer. This can be used with all integer control types
+ (``INTEGER``, ``INTEGER64``, ``U8``, ``U16`` and ``U32``).
+ For the signed types the signed two's complement representation is used.
+ This field applies to the control value as well as the ``minimum``,
+ ``maximum``, ``step`` and ``default_value`` fields.
+ * - __u32
+ - ``reserved``\ [31]
- Reserved for future extensions. Applications and drivers must set
the array to zero.

diff --git a/drivers/media/v4l2-core/v4l2-ctrls-api.c b/drivers/media/v4l2-core/v4l2-ctrls-api.c
index d9a422017bd9..ef16b00421ec 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-api.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-api.c
@@ -1101,6 +1101,7 @@ int v4l2_query_ext_ctrl(struct v4l2_ctrl_handler *hdl, struct v4l2_query_ext_ctr
qc->elems = ctrl->elems;
qc->nr_of_dims = ctrl->nr_of_dims;
memcpy(qc->dims, ctrl->dims, qc->nr_of_dims * sizeof(qc->dims[0]));
+ qc->fraction_bits = ctrl->fraction_bits;
qc->minimum = ctrl->minimum;
qc->maximum = ctrl->maximum;
qc->default_value = ctrl->default_value;
diff --git a/drivers/media/v4l2-core/v4l2-ctrls-core.c b/drivers/media/v4l2-core/v4l2-ctrls-core.c
index c4d995f32191..d83a37198bb5 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-core.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-core.c
@@ -252,12 +252,61 @@ void v4l2_ctrl_type_op_init(const struct v4l2_ctrl *ctrl, u32 from_idx,
}
EXPORT_SYMBOL(v4l2_ctrl_type_op_init);

+static void v4l2_ctrl_log_fp(s64 v, unsigned int fraction_bits)
+{
+ s64 i, f, mask;
+
+ if (!fraction_bits) {
+ pr_cont("%lld", v);
+ return;
+ }
+
+ mask = (1ULL << fraction_bits) - 1;
+
+ /*
+ * Note: this function does not support fixed point u64 with
+ * fraction_bits set to 64. At the moment there is no U64
+ * control type, but if that is added, then this code will have
+ * to add support for it.
+ */
+ if (fraction_bits >= 63)
+ i = v < 0 ? -1 : 0;
+ else
+ i = div64_s64(v, 1LL << fraction_bits);
+
+ f = v < 0 ? -((-v) & mask) : (v & mask);
+
+ if (!f) {
+ pr_cont("%lld", i);
+ } else if (fraction_bits < 20) {
+ u64 div = 1ULL << fraction_bits;
+
+ if (!i && f < 0)
+ pr_cont("-%lld/%llu", -f, div);
+ else if (!i)
+ pr_cont("%lld/%llu", f, div);
+ else if (i < 0 || f < 0)
+ pr_cont("-%lld-%llu/%llu", -i, -f, div);
+ else
+ pr_cont("%lld+%llu/%llu", i, f, div);
+ } else {
+ if (!i && f < 0)
+ pr_cont("-%lld/(2^%u)", -f, fraction_bits);
+ else if (!i)
+ pr_cont("%lld/(2^%u)", f, fraction_bits);
+ else if (i < 0 || f < 0)
+ pr_cont("-%lld-%llu/(2^%u)", -i, -f, fraction_bits);
+ else
+ pr_cont("%lld+%llu/(2^%u)", i, f, fraction_bits);
+ }
+}
+
void v4l2_ctrl_type_op_log(const struct v4l2_ctrl *ctrl)
{
union v4l2_ctrl_ptr ptr = ctrl->p_cur;

if (ctrl->is_array) {
- unsigned i;
+ unsigned int i;

for (i = 0; i < ctrl->nr_of_dims; i++)
pr_cont("[%u]", ctrl->dims[i]);
@@ -266,7 +315,7 @@ void v4l2_ctrl_type_op_log(const struct v4l2_ctrl *ctrl)

switch (ctrl->type) {
case V4L2_CTRL_TYPE_INTEGER:
- pr_cont("%d", *ptr.p_s32);
+ v4l2_ctrl_log_fp(*ptr.p_s32, ctrl->fraction_bits);
break;
case V4L2_CTRL_TYPE_BOOLEAN:
pr_cont("%s", *ptr.p_s32 ? "true" : "false");
@@ -281,19 +330,19 @@ void v4l2_ctrl_type_op_log(const struct v4l2_ctrl *ctrl)
pr_cont("0x%08x", *ptr.p_s32);
break;
case V4L2_CTRL_TYPE_INTEGER64:
- pr_cont("%lld", *ptr.p_s64);
+ v4l2_ctrl_log_fp(*ptr.p_s64, ctrl->fraction_bits);
break;
case V4L2_CTRL_TYPE_STRING:
pr_cont("%s", ptr.p_char);
break;
case V4L2_CTRL_TYPE_U8:
- pr_cont("%u", (unsigned)*ptr.p_u8);
+ v4l2_ctrl_log_fp(*ptr.p_u8, ctrl->fraction_bits);
break;
case V4L2_CTRL_TYPE_U16:
- pr_cont("%u", (unsigned)*ptr.p_u16);
+ v4l2_ctrl_log_fp(*ptr.p_u16, ctrl->fraction_bits);
break;
case V4L2_CTRL_TYPE_U32:
- pr_cont("%u", (unsigned)*ptr.p_u32);
+ v4l2_ctrl_log_fp(*ptr.p_u32, ctrl->fraction_bits);
break;
case V4L2_CTRL_TYPE_H264_SPS:
pr_cont("H264_SPS");
@@ -1753,11 +1802,12 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
u32 id, const char *name, enum v4l2_ctrl_type type,
s64 min, s64 max, u64 step, s64 def,
const u32 dims[V4L2_CTRL_MAX_DIMS], u32 elem_size,
- u32 flags, const char * const *qmenu,
+ u32 fraction_bits, u32 flags, const char * const *qmenu,
const s64 *qmenu_int, const union v4l2_ctrl_ptr p_def,
void *priv)
{
struct v4l2_ctrl *ctrl;
+ unsigned int max_fraction_bits = 0;
unsigned sz_extra;
unsigned nr_of_dims = 0;
unsigned elems = 1;
@@ -1779,20 +1829,28 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,

/* Prefill elem_size for all types handled by std_type_ops */
switch ((u32)type) {
+ case V4L2_CTRL_TYPE_INTEGER:
+ elem_size = sizeof(s32);
+ max_fraction_bits = 31;
+ break;
case V4L2_CTRL_TYPE_INTEGER64:
elem_size = sizeof(s64);
+ max_fraction_bits = 63;
break;
case V4L2_CTRL_TYPE_STRING:
elem_size = max + 1;
break;
case V4L2_CTRL_TYPE_U8:
elem_size = sizeof(u8);
+ max_fraction_bits = 8;
break;
case V4L2_CTRL_TYPE_U16:
elem_size = sizeof(u16);
+ max_fraction_bits = 16;
break;
case V4L2_CTRL_TYPE_U32:
elem_size = sizeof(u32);
+ max_fraction_bits = 32;
break;
case V4L2_CTRL_TYPE_MPEG2_SEQUENCE:
elem_size = sizeof(struct v4l2_ctrl_mpeg2_sequence);
@@ -1876,10 +1934,10 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
}

/* Sanity checks */
- if (id == 0 || name == NULL || !elem_size ||
- id >= V4L2_CID_PRIVATE_BASE ||
- (type == V4L2_CTRL_TYPE_MENU && qmenu == NULL) ||
- (type == V4L2_CTRL_TYPE_INTEGER_MENU && qmenu_int == NULL)) {
+ if (id == 0 || !name || !elem_size ||
+ fraction_bits > max_fraction_bits || id >= V4L2_CID_PRIVATE_BASE ||
+ (type == V4L2_CTRL_TYPE_MENU && !qmenu) ||
+ (type == V4L2_CTRL_TYPE_INTEGER_MENU && !qmenu_int)) {
handler_set_err(hdl, -ERANGE);
return NULL;
}
@@ -1940,6 +1998,7 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
ctrl->name = name;
ctrl->type = type;
ctrl->flags = flags;
+ ctrl->fraction_bits = fraction_bits;
ctrl->minimum = min;
ctrl->maximum = max;
ctrl->step = step;
@@ -2038,7 +2097,7 @@ struct v4l2_ctrl *v4l2_ctrl_new_custom(struct v4l2_ctrl_handler *hdl,
ctrl = v4l2_ctrl_new(hdl, cfg->ops, cfg->type_ops, cfg->id, name,
type, min, max,
is_menu ? cfg->menu_skip_mask : step, def,
- cfg->dims, cfg->elem_size,
+ cfg->dims, cfg->elem_size, cfg->fraction_bits,
flags, qmenu, qmenu_int, cfg->p_def, priv);
if (ctrl)
ctrl->is_private = cfg->is_private;
@@ -2063,7 +2122,7 @@ struct v4l2_ctrl *v4l2_ctrl_new_std(struct v4l2_ctrl_handler *hdl,
return NULL;
}
return v4l2_ctrl_new(hdl, ops, NULL, id, name, type,
- min, max, step, def, NULL, 0,
+ min, max, step, def, NULL, 0, 0,
flags, NULL, NULL, ptr_null, NULL);
}
EXPORT_SYMBOL(v4l2_ctrl_new_std);
@@ -2096,7 +2155,7 @@ struct v4l2_ctrl *v4l2_ctrl_new_std_menu(struct v4l2_ctrl_handler *hdl,
return NULL;
}
return v4l2_ctrl_new(hdl, ops, NULL, id, name, type,
- 0, max, mask, def, NULL, 0,
+ 0, max, mask, def, NULL, 0, 0,
flags, qmenu, qmenu_int, ptr_null, NULL);
}
EXPORT_SYMBOL(v4l2_ctrl_new_std_menu);
@@ -2128,7 +2187,7 @@ struct v4l2_ctrl *v4l2_ctrl_new_std_menu_items(struct v4l2_ctrl_handler *hdl,
return NULL;
}
return v4l2_ctrl_new(hdl, ops, NULL, id, name, type,
- 0, max, mask, def, NULL, 0,
+ 0, max, mask, def, NULL, 0, 0,
flags, qmenu, NULL, ptr_null, NULL);

}
@@ -2150,7 +2209,7 @@ struct v4l2_ctrl *v4l2_ctrl_new_std_compound(struct v4l2_ctrl_handler *hdl,
return NULL;
}
return v4l2_ctrl_new(hdl, ops, NULL, id, name, type,
- min, max, step, def, NULL, 0,
+ min, max, step, def, NULL, 0, 0,
flags, NULL, NULL, p_def, NULL);
}
EXPORT_SYMBOL(v4l2_ctrl_new_std_compound);
@@ -2174,7 +2233,7 @@ struct v4l2_ctrl *v4l2_ctrl_new_int_menu(struct v4l2_ctrl_handler *hdl,
return NULL;
}
return v4l2_ctrl_new(hdl, ops, NULL, id, name, type,
- 0, max, 0, def, NULL, 0,
+ 0, max, 0, def, NULL, 0, 0,
flags, NULL, qmenu_int, ptr_null, NULL);
}
EXPORT_SYMBOL(v4l2_ctrl_new_int_menu);
diff --git a/include/media/v4l2-ctrls.h b/include/media/v4l2-ctrls.h
index 59679a42b3e7..c35514c5bf88 100644
--- a/include/media/v4l2-ctrls.h
+++ b/include/media/v4l2-ctrls.h
@@ -211,7 +211,8 @@ typedef void (*v4l2_ctrl_notify_fnc)(struct v4l2_ctrl *ctrl, void *priv);
* except for dynamic arrays. In that case it is in the range of
* 1 to @p_array_alloc_elems.
* @dims: The size of each dimension.
- * @nr_of_dims:The number of dimensions in @dims.
+ * @nr_of_dims: The number of dimensions in @dims.
+ * @fraction_bits: The number of fraction bits for fixed point values.
* @menu_skip_mask: The control's skip mask for menu controls. This makes it
* easy to skip menu items that are not valid. If bit X is set,
* then menu item X is skipped. Of course, this only works for
@@ -228,6 +229,7 @@ typedef void (*v4l2_ctrl_notify_fnc)(struct v4l2_ctrl *ctrl, void *priv);
* :math:`ceil(\frac{maximum - minimum}{step}) + 1`.
* Used only if the @type is %V4L2_CTRL_TYPE_INTEGER_MENU.
* @flags: The control's flags.
+ * @fraction_bits: The number of fraction bits for fixed point values.
* @priv: The control's private pointer. For use by the driver. It is
* untouched by the control framework. Note that this pointer is
* not freed when the control is deleted. Should this be needed
@@ -286,6 +288,7 @@ struct v4l2_ctrl {
u32 new_elems;
u32 dims[V4L2_CTRL_MAX_DIMS];
u32 nr_of_dims;
+ u32 fraction_bits;
union {
u64 step;
u64 menu_skip_mask;
@@ -426,6 +429,7 @@ struct v4l2_ctrl_handler {
* @dims: The size of each dimension.
* @elem_size: The size in bytes of the control.
* @flags: The control's flags.
+ * @fraction_bits: The number of fraction bits for fixed point values.
* @menu_skip_mask: The control's skip mask for menu controls. This makes it
* easy to skip menu items that are not valid. If bit X is set,
* then menu item X is skipped. Of course, this only works for
@@ -455,6 +459,7 @@ struct v4l2_ctrl_config {
u32 dims[V4L2_CTRL_MAX_DIMS];
u32 elem_size;
u32 flags;
+ u32 fraction_bits;
u64 menu_skip_mask;
const char * const *qmenu;
const s64 *qmenu_int;
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index a8015e5e7fa4..b8573e9ccde6 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1947,7 +1947,8 @@ struct v4l2_query_ext_ctrl {
__u32 elems;
__u32 nr_of_dims;
__u32 dims[V4L2_CTRL_MAX_DIMS];
- __u32 reserved[32];
+ __u32 fraction_bits;
+ __u32 reserved[31];
};

/* Used in the VIDIOC_QUERYMENU ioctl for querying menu items */
--
2.34.1
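For driver writers wondering how to interpret a control that reports a non-zero fraction_bits: the raw value is a signed two's complement Qf number, i.e. raw / 2^f. A small user-space sketch of the split performed by v4l2_ctrl_log_fp() in the patch above (illustrative helpers, assuming f is well below 62):

```c
#include <stdint.h>

/* Split a signed two's complement Qf control value into its integer
 * part and sign-matched fraction, the way v4l2_ctrl_log_fp() does.
 * 'f' is the fraction_bits reported by VIDIOC_QUERY_EXT_CTRL. */
static void qf_split(int64_t v, unsigned int f, int64_t *ipart, int64_t *fpart)
{
	int64_t mask = ((int64_t)1 << f) - 1;

	*ipart = v / ((int64_t)1 << f);		/* truncates toward zero */
	*fpart = v < 0 ? -((-v) & mask) : (v & mask);
}

/* The represented real value is raw / 2^f. */
static double qf_to_double(int64_t v, unsigned int f)
{
	return (double)v / (double)((int64_t)1 << f);
}
```

So a raw value of 88 with fraction_bits = 4 reads as 5 + 8/16 = 5.5, and -88 as -5 - 8/16 = -5.5, matching the "i+f/div" strings the log helper prints.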


2024-04-30 08:21:27

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

Hey Shengjiu,

first of all, thanks for all of this work, and I am sorry for only
chiming in this late in the series; I sadly didn't notice it earlier.

I would like to voice a few concerns about the general idea of adding
Audio support to the Media subsystem.

1. The biggest objection is that the Linux kernel already has a
subsystem specifically targeted at audio devices; adding support for
these devices in another subsystem is counterproductive, as it works
around the shortcomings of the audio subsystem while forcing support
for a device into a subsystem that was never designed for such devices.
Instead, the audio subsystem has to be adjusted to be able to support
all of the required workflows, otherwise, the next audio driver with
similar requirements will have to move to the media subsystem as well,
the audio subsystem would then never experience the required change and
soon we would have two audio subsystems.

2. Closely connected to the previous objection, the media subsystem with
its current staff of maintainers is overworked and barely capable of
handling the workload, which includes an abundance of different devices
from DVB, codecs, cameras, PCI devices, radio tuners, HDMI CEC, IR
receivers, etc. Adding more device types to this matrix will make the
situation worse and should only be done with a plan for first improving
the current maintainer situation.

3. By using the same framework and APIs as the video codecs, the audio
codecs are going to cause extra work for the video codec developers and
maintainers simply by occupying the same space that was originally
designed for video only. Even if you try not to cause any extra
stress, the simple presence of the audio code in the codebase is going
to impose restrictions.

The main issue here is that the audio subsystem doesn't provide a
mem2mem framework, and I would say you are in luck: the media
subsystem's current mem2mem implementation has gathered a lot of
shortcomings over time, which is why a new implementation will be
necessary anyway.

So instead of hammering a driver into the wrong destination, I would
suggest bundling our forces and implementing a general memory-to-memory
framework that both the media and the audio subsystem can use, that
addresses the current shortcomings of the implementation and allows you
to upload the driver where it is supposed to be.
This is going to cause restrictions as well, as mentioned in concern
number 3, but with the difference that we can make a general plan for
such a framework that accommodates lots of use cases, and each
subsystem can add its routines on top of the general framework.

Another possible alternative is to try to make the DRM scheduler more
generally available. This scheduler is the most mature and in fact is
very similar to what you and the media devices need, which again shows
how common your use case actually is and how a general solution is the
best long-term one.

Please notice that Daniel Almeida is currently working on something
related to this:
https://lore.kernel.org/linux-media/[email protected]/T/#u

If the top-level maintainers decide to merge the patch set, so be it,
but I wanted to voice my concerns and also highlight that this is
likely going to cause extra stress for the video codec maintainers and
the maintainers in general. We cannot spend a lot of time on audio
codecs, as video codecs already fill up our available time, so the use
of the framework needs to be conservative and cause as little extra
work as possible for its original use case.

Regards,
Sebastian

On 19.03.2024 15:50, Shengjiu Wang wrote:
>Audio signal processing also has the requirement for memory to
>memory similar as Video.
>
>This asrc memory to memory (memory ->asrc->memory) case is a non
>real time use case.
>
>User fills the input buffer to the asrc module, after conversion, then asrc
>sends back the output buffer to user. So it is not a traditional ALSA playback
>and capture case.
>
>It is a specific use case, there is no reference in current kernel.
>v4l2 memory to memory is the closed implementation, v4l2 current
>support video, image, radio, tuner, touch devices, so it is not
>complicated to add support for this specific audio case.
>
>Because we had implemented the "memory -> asrc ->i2s device-> codec"
>use case in ALSA. Now the "memory->asrc->memory" needs
>to reuse the code in asrc driver, so the first 3 patches is for refining
>the code to make it can be shared by the "memory->asrc->memory"
>driver.
>
>The main change is in the v4l2 side, A /dev/vl4-audioX will be created,
>user applications only use the ioctl of v4l2 framework.
>
>Other change is to add memory to memory support for two kinds of i.MX ASRC
>module.
>
>changes in v15:
>- update MAINTAINERS for imx-asrc.c and vim2m-audio.c
>
>changes in v14:
>- document the reservation of 'AUXX' fourcc format.
>- add v4l2_audfmt_to_fourcc() definition.
>
>changes in v13
>- change 'pixelformat' to 'audioformat' in dev-audio-mem2mem.rst
>- add more description for clock drift in ext-ctrls-audio-m2m.rst
>- Add "media: v4l2-ctrls: add support for fraction_bits" from Hans
> to avoid build issue for kernel test robot
>
>changes in v12
>- minor changes according to comments
>- drop min_buffers_needed = 1 and V4L2_CTRL_FLAG_UPDATE flag
>- drop bus_info
>
>changes in v11
>- add add-fixed-point-test-controls in vivid.
>- add v4l2_ctrl_fp_compose() helper function for min and max
>
>changes in v10
>- remove FIXED_POINT type
>- change code base on media: v4l2-ctrls: add support for fraction_bits
>- fix issue reported by kernel test robot
>- remove module_alias
>
>changes in v9:
>- add MEDIA_ENT_F_PROC_AUDIO_RESAMPLER.
>- add MEDIA_INTF_T_V4L_AUDIO
>- add media controller support
>- refine the vim2m-audio to support 8k<->16k conversion.
>
>changes in v8:
>- refine V4L2_CAP_AUDIO_M2M to be 0x00000008
>- update doc for FIXED_POINT
>- address comments for imx-asrc
>
>changes in v7:
>- add acked-by from Mark
>- separate commit for fixed point, m2m audio class, audio rate controls
>- use INTEGER_MENU for rate, FIXED_POINT for rate offset
>- remove used fmts
>- address other comments for Hans
>
>changes in v6:
>- use m2m_prepare/m2m_unprepare/m2m_start/m2m_stop to replace
> m2m_start_part_one/m2m_stop_part_one, m2m_start_part_two/m2m_stop_part_two.
>- change V4L2_CTRL_TYPE_ASRC_RATE to V4L2_CTRL_TYPE_FIXED_POINT
>- fix warning by kernel test rebot
>- remove some unused format V4L2_AUDIO_FMT_XX
>- Get SNDRV_PCM_FORMAT from V4L2_AUDIO_FMT in driver.
>- rename audm2m to viaudm2m.
>
>changes in v5:
>- remove V4L2_AUDIO_FMT_LPCM
>- define audio pixel format like V4L2_AUDIO_FMT_S8...
>- remove rate and format in struct v4l2_audio_format.
>- Add V4L2_CID_ASRC_SOURCE_RATE and V4L2_CID_ASRC_DEST_RATE controls
>- updata document accordingly.
>
>changes in v4:
>- update document style
>- separate V4L2_AUDIO_FMT_LPCM and V4L2_CAP_AUDIO_M2M in separate commit
>
>changes in v3:
>- Modify documents for adding audio m2m support
>- Add audio virtual m2m driver
>- Defined V4L2_AUDIO_FMT_LPCM format type for audio.
>- Defined V4L2_CAP_AUDIO_M2M capability type for audio m2m case.
>- with modification in v4l-utils, pass v4l2-compliance test.
>
>changes in v2:
>- decouple the implementation in v4l2 and ALSA
>- implement the memory to memory driver as a platfrom driver
> and move it to driver/media
>- move fsl_asrc_common.h to include/sound folder
>
>Hans Verkuil (1):
> media: v4l2-ctrls: add support for fraction_bits
>
>Shengjiu Wang (15):
> ASoC: fsl_asrc: define functions for memory to memory usage
> ASoC: fsl_easrc: define functions for memory to memory usage
> ASoC: fsl_asrc: move fsl_asrc_common.h to include/sound
> ASoC: fsl_asrc: register m2m platform device
> ASoC: fsl_easrc: register m2m platform device
> media: uapi: Add V4L2_CAP_AUDIO_M2M capability flag
> media: v4l2: Add audio capture and output support
> media: uapi: Define audio sample format fourcc type
> media: uapi: Add V4L2_CTRL_CLASS_M2M_AUDIO
> media: uapi: Add audio rate controls support
> media: uapi: Declare interface types for Audio
> media: uapi: Add an entity type for audio resampler
> media: vivid: add fixed point test controls
> media: imx-asrc: Add memory to memory driver
> media: vim2m-audio: add virtual driver for audio memory to memory
>
> .../media/mediactl/media-types.rst | 11 +
> .../userspace-api/media/v4l/buffer.rst | 6 +
> .../userspace-api/media/v4l/common.rst | 1 +
> .../media/v4l/dev-audio-mem2mem.rst | 71 +
> .../userspace-api/media/v4l/devices.rst | 1 +
> .../media/v4l/ext-ctrls-audio-m2m.rst | 59 +
> .../userspace-api/media/v4l/pixfmt-audio.rst | 100 ++
> .../userspace-api/media/v4l/pixfmt.rst | 1 +
> .../media/v4l/vidioc-enum-fmt.rst | 2 +
> .../media/v4l/vidioc-g-ext-ctrls.rst | 4 +
> .../userspace-api/media/v4l/vidioc-g-fmt.rst | 4 +
> .../media/v4l/vidioc-querycap.rst | 3 +
> .../media/v4l/vidioc-queryctrl.rst | 11 +-
> .../media/videodev2.h.rst.exceptions | 3 +
> MAINTAINERS | 17 +
> .../media/common/videobuf2/videobuf2-v4l2.c | 4 +
> drivers/media/platform/nxp/Kconfig | 13 +
> drivers/media/platform/nxp/Makefile | 1 +
> drivers/media/platform/nxp/imx-asrc.c | 1256 +++++++++++++++++
> drivers/media/test-drivers/Kconfig | 10 +
> drivers/media/test-drivers/Makefile | 1 +
> drivers/media/test-drivers/vim2m-audio.c | 793 +++++++++++
> drivers/media/test-drivers/vivid/vivid-core.h | 2 +
> .../media/test-drivers/vivid/vivid-ctrls.c | 26 +
> drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 9 +
> drivers/media/v4l2-core/v4l2-ctrls-api.c | 1 +
> drivers/media/v4l2-core/v4l2-ctrls-core.c | 93 +-
> drivers/media/v4l2-core/v4l2-ctrls-defs.c | 10 +
> drivers/media/v4l2-core/v4l2-dev.c | 21 +
> drivers/media/v4l2-core/v4l2-ioctl.c | 66 +
> drivers/media/v4l2-core/v4l2-mem2mem.c | 13 +-
> include/media/v4l2-ctrls.h | 13 +-
> include/media/v4l2-dev.h | 2 +
> include/media/v4l2-ioctl.h | 34 +
> .../fsl => include/sound}/fsl_asrc_common.h | 60 +
> include/uapi/linux/media.h | 2 +
> include/uapi/linux/v4l2-controls.h | 9 +
> include/uapi/linux/videodev2.h | 50 +-
> sound/soc/fsl/fsl_asrc.c | 144 ++
> sound/soc/fsl/fsl_asrc.h | 4 +-
> sound/soc/fsl/fsl_asrc_dma.c | 2 +-
> sound/soc/fsl/fsl_easrc.c | 233 +++
> sound/soc/fsl/fsl_easrc.h | 6 +-
> 43 files changed, 3145 insertions(+), 27 deletions(-)
> create mode 100644 Documentation/userspace-api/media/v4l/dev-audio-mem2mem.rst
> create mode 100644 Documentation/userspace-api/media/v4l/ext-ctrls-audio-m2m.rst
> create mode 100644 Documentation/userspace-api/media/v4l/pixfmt-audio.rst
> create mode 100644 drivers/media/platform/nxp/imx-asrc.c
> create mode 100644 drivers/media/test-drivers/vim2m-audio.c
> rename {sound/soc/fsl => include/sound}/fsl_asrc_common.h (60%)
>
>--
>2.34.1
>
>

2024-04-30 08:47:35

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 30/04/2024 10:21, Sebastian Fricke wrote:
> Hey Shengjiu,
>
> first of all, thanks for all of this work, and I am very sorry for only
> joining this late in the series; I sadly didn't notice it earlier.
>
> I would like to voice a few concerns about the general idea of adding
> Audio support to the Media subsystem.
>
> 1. The biggest objection is that the Linux kernel has a subsystem
> specifically targeted at audio devices; adding support for these
> devices in another subsystem is counterproductive, as it works around
> the shortcomings of the audio subsystem while forcing support for a
> device into a subsystem that was never designed for such devices.
> Instead, the audio subsystem has to be adjusted to be able to support
> all of the required workflows, otherwise, the next audio driver with
> similar requirements will have to move to the media subsystem as well,
> the audio subsystem would then never experience the required change and
> soon we would have two audio subsystems.
>
> 2. Closely connected to the previous objection, the media subsystem with
> its current staff of maintainers is overworked and barely capable of
> handling the workload, which includes an abundance of different devices
> from DVB, codecs, cameras, PCI devices, radio tuners, HDMI CEC, IR
> receivers, etc. Adding more device types to this matrix will make the
> situation worse and should only be done with a plan for how first to
> improve the current maintainer situation.
>
> 3. By using the same framework and APIs as the video codecs, the audio
> codecs are going to cause extra work for the video codec developers and
> maintainers simply by occupying the same space that was originally
> designed for the purpose of video only. Even if you try not to cause any
> extra stress, the simple presence of the audio code in the codebase is
> going to cause restrictions.
>
> The main issue here is that the audio subsystem doesn't provide a
> mem2mem framework and I would say you are in luck because the media
> subsystem has gathered a lot of shortcomings with its current
> implementation of the mem2mem framework over time, which is why a new
> implementation will be necessary anyway.
>
> So instead of hammering a driver into the wrong destination, I would
> suggest bundling our forces and implementing a general memory-to-memory
> framework that both the media and the audio subsystem can use, that
> addresses the current shortcomings of the implementation and allows you
> to upload the driver where it is supposed to be.
> This is going to cause restrictions as well, as mentioned in
> concern number 3, but with the difference that we can make a general
> plan for such a framework that accommodates lots of use cases and each
> subsystem can add its routines on top of the general framework.
>
> Another possible alternative is to try and make the DRM scheduler more
> generally available, this scheduler is the most mature and in fact is
> very similar to what you and what the media devices need.
> Which again just shows how common your use case actually is and how a
> general solution is the best long term solution.
>
> Please notice that Daniel Almeida is currently working on something
> related to this:
> https://lore.kernel.org/linux-media/[email protected]/T/#u
>
> If the toplevel maintainers decide to add the patchset so be it, but I
> wanted to voice my concerns and also highlight that this is likely going
> to cause extra stress for the video codecs maintainers and the
> maintainers in general. We cannot spend a lot of time on audio codecs,
> as video codecs already fill up our available time sufficiently,
> so the use of the framework needs to be conservative and cause as little
> extra work as possible for the original use case of the framework.

I would really like to get the input of the audio maintainers on this.
Sebastian has a good point, especially with us being overworked :-)

Having a shared mem2mem framework would certainly be nice, on the other
hand, developing that will most likely take a substantial amount of time.

Perhaps it is possible to copy the current media v4l2-mem2mem.c and turn
it into an alsa-mem2mem.c? I really do not know enough about the alsa
subsystem to tell if that is possible.

While this driver is a rate converter, not an audio codec, the same
principles would apply to off-line audio codecs as well. And it is true
that we definitely do not want to support audio codecs in the media
subsystem.

Accepting this driver creates a precedent and would open the door for
audio codecs.

I may have been too hasty in saying yes to this, I did not consider
the wider implications for our workload and what it can lead to. I
sincerely apologize to Shengjiu Wang as it is no fun to end up in a
situation like this.

Regards,

Hans

2024-04-30 13:52:55

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

Em Tue, 30 Apr 2024 10:47:13 +0200
Hans Verkuil <[email protected]> escreveu:

> On 30/04/2024 10:21, Sebastian Fricke wrote:
> > Hey Shengjiu,
> >
> > first of all, thanks for all of this work, and I am very sorry for only
> > joining this late in the series; I sadly didn't notice it earlier.
> >
> > I would like to voice a few concerns about the general idea of adding
> > Audio support to the Media subsystem.
> >
> > 1. The biggest objection is that the Linux kernel has a subsystem
> > specifically targeted at audio devices; adding support for these
> > devices in another subsystem is counterproductive, as it works around
> > the shortcomings of the audio subsystem while forcing support for a
> > device into a subsystem that was never designed for such devices.
> > Instead, the audio subsystem has to be adjusted to be able to support
> > all of the required workflows, otherwise, the next audio driver with
> > similar requirements will have to move to the media subsystem as well,
> > the audio subsystem would then never experience the required change and
> > soon we would have two audio subsystems.
> >
> > 2. Closely connected to the previous objection, the media subsystem with
> > its current staff of maintainers is overworked and barely capable of
> > handling the workload, which includes an abundance of different devices
> > from DVB, codecs, cameras, PCI devices, radio tuners, HDMI CEC, IR
> > receivers, etc. Adding more device types to this matrix will make the
> > situation worse and should only be done with a plan for how first to
> > improve the current maintainer situation.
> >
> > 3. By using the same framework and APIs as the video codecs, the audio
> > codecs are going to cause extra work for the video codec developers and
> > maintainers simply by occupying the same space that was originally
> > designed for the purpose of video only. Even if you try not to cause any
> > extra stress, the simple presence of the audio code in the codebase is
> > going to cause restrictions.
> >
> > The main issue here is that the audio subsystem doesn't provide a
> > mem2mem framework and I would say you are in luck because the media
> > subsystem has gathered a lot of shortcomings with its current
> > implementation of the mem2mem framework over time, which is why a new
> > implementation will be necessary anyway.
> >
> > So instead of hammering a driver into the wrong destination, I would
> > suggest bundling our forces and implementing a general memory-to-memory
> > framework that both the media and the audio subsystem can use, that
> > addresses the current shortcomings of the implementation and allows you
> > to upload the driver where it is supposed to be.
> > This is going to cause restrictions as well, as mentioned in
> > concern number 3, but with the difference that we can make a general
> > plan for such a framework that accommodates lots of use cases and each
> > subsystem can add its routines on top of the general framework.
> >
> > Another possible alternative is to try and make the DRM scheduler more
> > generally available, this scheduler is the most mature and in fact is
> > very similar to what you and what the media devices need.
> > Which again just shows how common your use case actually is and how a
> > general solution is the best long term solution.
> >
> > Please notice that Daniel Almeida is currently working on something
> > related to this:
> > https://lore.kernel.org/linux-media/[email protected]/T/#u
> >
> > If the toplevel maintainers decide to add the patchset so be it, but I
> > wanted to voice my concerns and also highlight that this is likely going
> > to cause extra stress for the video codecs maintainers and the
> > maintainers in general. We cannot spend a lot of time on audio codecs,
> > as video codecs already fill up our available time sufficiently,
> > so the use of the framework needs to be conservative and cause as little
> > extra work as possible for the original use case of the framework.
>
> I would really like to get the input of the audio maintainers on this.
> Sebastian has a good point, especially with us being overworked :-)
>
> Having a shared mem2mem framework would certainly be nice, on the other
> hand, developing that will most likely take a substantial amount of time.
>
> Perhaps it is possible to copy the current media v4l2-mem2mem.c and turn
> it into an alsa-mem2mem.c? I really do not know enough about the alsa
> subsystem to tell if that is possible.
>
> While this driver is a rate converter, not an audio codec, the same
> principles would apply to off-line audio codecs as well. And it is true
> that we definitely do not want to support audio codecs in the media
> subsystem.
>
> Accepting this driver creates a precedent and would open the door for
> audio codecs.
>
> I may have been too hasty in saying yes to this, I did not consider
> the wider implications for our workload and what it can lead to. I
> sincerely apologize to Shengjiu Wang as it is no fun to end up in a
> situation like this.

I agree with both Sebastian and Hans here: media devices always had
audio streams, even on old PCI analog TV devices like bttv. There
are even some devices, like the ones based on USB em28xx, that contain
an AC97 chip. The decision was always to have audio supported by
the ALSA APIs/subsystem, as otherwise we would end up duplicating code and
reinventing the wheel with new incompatible APIs for audio inside and outside
media, creating unneeded complexity, which would end up being reflected in
userspace as well.

So, IMO it makes a lot more sense to place audio codecs and processor
blocks inside ALSA, probably as part of ALSA SOF, if possible.

Hans's suggestion of forking v4l2-mem2mem.c in ALSA seems like a good
starting point. Also, moving the DRM mem2mem functionality to a
core library that could be re-used by the three subsystems sounds
like a good idea, but I suspect that a change like that could be more
time-consuming.

Regards,
Mauro

2024-04-30 14:47:33

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On Tue, Apr 30, 2024 at 10:21:12AM +0200, Sebastian Fricke wrote:

> first of all, thanks for all of this work, and I am very sorry for only
> joining this late in the series; I sadly didn't notice it earlier.

It might be worth checking out the discussion on earlier versions...

> 1. The biggest objection is that the Linux kernel has a subsystem
> specifically targeted at audio devices; adding support for these
> devices in another subsystem is counterproductive, as it works around
> the shortcomings of the audio subsystem while forcing support for a
> device into a subsystem that was never designed for such devices.
> Instead, the audio subsystem has to be adjusted to be able to support
> all of the required workflows, otherwise, the next audio driver with
> similar requirements will have to move to the media subsystem as well,
> the audio subsystem would then never experience the required change and
> soon we would have two audio subsystems.

The discussion around this originally was that all the audio APIs are
very much centered around real time operations rather than completely
async memory to memory operations and that it's not clear that it's
worth reinventing the wheel simply for the sake of having things in
ALSA when that's already pretty idiomatic for the media subsystem. It
wasn't the memory to memory bit per se, it was the disconnection from
any timing.

> So instead of hammering a driver into the wrong destination, I would
> suggest bundling our forces and implementing a general memory-to-memory
> framework that both the media and the audio subsystem can use, that
> addresses the current shortcomings of the implementation and allows you
> to upload the driver where it is supposed to be.

That doesn't sound like an immediate solution to maintainer overload
issues... if something like this is going to happen the DRM solution
does seem more general but I'm not sure the amount of stop energy is
proportionate.



2024-04-30 15:41:10

by Jaroslav Kysela

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 30. 04. 24 16:46, Mark Brown wrote:

>> So instead of hammering a driver into the wrong destination, I would
>> suggest bundling our forces and implementing a general memory-to-memory
>> framework that both the media and the audio subsystem can use, that
>> addresses the current shortcomings of the implementation and allows you
>> to upload the driver where it is supposed to be.
>
> That doesn't sound like an immediate solution to maintainer overload
> issues... if something like this is going to happen the DRM solution
> does seem more general but I'm not sure the amount of stop energy is
> proportionate.

ALSA's "do what you want" hwdep device/interface can be used to transfer
data in/out of the SRC using custom read/write/ioctl/mmap syscalls. The question
is whether the changes could be simpler for a first implementation, keeping
the hardware enumeration in the same subsystem where the driver code is placed. I
also see the benefit of reusing an already existing framework (but is v4l2 the
right one?).

Jaroslav

--
Jaroslav Kysela <[email protected]>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.


2024-04-30 16:28:17

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

Em Tue, 30 Apr 2024 23:46:03 +0900
Mark Brown <[email protected]> escreveu:

> On Tue, Apr 30, 2024 at 10:21:12AM +0200, Sebastian Fricke wrote:
>
> > first of all, thanks for all of this work, and I am very sorry for only
> > joining this late in the series; I sadly didn't notice it earlier.
>
> It might be worth checking out the discussion on earlier versions...
>
> > 1. The biggest objection is that the Linux kernel has a subsystem
> > specifically targeted at audio devices; adding support for these
> > devices in another subsystem is counterproductive, as it works around
> > the shortcomings of the audio subsystem while forcing support for a
> > device into a subsystem that was never designed for such devices.
> > Instead, the audio subsystem has to be adjusted to be able to support
> > all of the required workflows, otherwise, the next audio driver with
> > similar requirements will have to move to the media subsystem as well,
> > the audio subsystem would then never experience the required change and
> > soon we would have two audio subsystems.
>
> The discussion around this originally was that all the audio APIs are
> very much centered around real time operations rather than completely
> async memory to memory operations and that it's not clear that it's
> worth reinventing the wheel simply for the sake of having things in
> ALSA when that's already pretty idiomatic for the media subsystem. It
> wasn't the memory to memory bit per se, it was the disconnection from
> any timing.

The media subsystem is also centered around real time. Without real
time, you can't have a decent video conference system. Having
mem2mem transfers actually help reducing real time delays, as it
avoids extra latency due to CPU congestion and/or data transfers
from/to userspace.

>
> > So instead of hammering a driver into the wrong destination, I would
> > suggest bundling our forces and implementing a general memory-to-memory
> > framework that both the media and the audio subsystem can use, that
> > addresses the current shortcomings of the implementation and allows you
> > to upload the driver where it is supposed to be.
>
> That doesn't sound like an immediate solution to maintainer overload
> issues... if something like this is going to happen the DRM solution
> does seem more general but I'm not sure the amount of stop energy is
> proportionate.

I don't think maintainer overload is the issue here. The main
point is to avoid a fork at the audio uAPI, plus the burden
of re-inventing the wheel with new codes for audio formats,
new documentation for them, etc.

Regards,
Mauro

2024-05-01 01:56:33

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On Tue, Apr 30, 2024 at 05:27:52PM +0100, Mauro Carvalho Chehab wrote:
> Mark Brown <[email protected]> escreveu:
> > On Tue, Apr 30, 2024 at 10:21:12AM +0200, Sebastian Fricke wrote:

> > The discussion around this originally was that all the audio APIs are
> > very much centered around real time operations rather than completely

> The media subsystem is also centered around real time. Without real
> time, you can't have a decent video conference system. Having
> mem2mem transfers actually help reducing real time delays, as it
> avoids extra latency due to CPU congestion and/or data transfers
> from/to userspace.

Real time means strongly tied to wall clock times rather than fast - the
issue was that all the ALSA APIs are based around pushing data through
the system based on a clock.

> > That doesn't sound like an immediate solution to maintainer overload
> > issues... if something like this is going to happen the DRM solution
> > does seem more general but I'm not sure the amount of stop energy is
> > proportionate.

> I don't think maintainer overload is the issue here. The main
> point is to avoid a fork at the audio uAPI, plus the burden
> of re-inventing the wheel with new codes for audio formats,
> new documentation for them, etc.

I thought that discussion had been had already at one of the earlier
versions? TBH I've not really been paying attention to this since the
very early versions where I raised some similar "why is this in media"
points and I thought everyone had decided that this did actually make
sense.



2024-05-02 07:46:21

by Takashi Iwai

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On Wed, 01 May 2024 03:56:15 +0200,
Mark Brown wrote:
>
> On Tue, Apr 30, 2024 at 05:27:52PM +0100, Mauro Carvalho Chehab wrote:
> > Mark Brown <[email protected]> escreveu:
> > > On Tue, Apr 30, 2024 at 10:21:12AM +0200, Sebastian Fricke wrote:
>
> > > The discussion around this originally was that all the audio APIs are
> > > very much centered around real time operations rather than completely
>
> > The media subsystem is also centered around real time. Without real
> > time, you can't have a decent video conference system. Having
> > mem2mem transfers actually help reducing real time delays, as it
> > avoids extra latency due to CPU congestion and/or data transfers
> > from/to userspace.
>
> Real time means strongly tied to wall clock times rather than fast - the
> issue was that all the ALSA APIs are based around pushing data through
> the system based on a clock.
>
> > > That doesn't sound like an immediate solution to maintainer overload
> > > issues... if something like this is going to happen the DRM solution
> > > does seem more general but I'm not sure the amount of stop energy is
> > > proportionate.
>
> > I don't think maintainer overload is the issue here. The main
> > point is to avoid a fork at the audio uAPI, plus the burden
> > of re-inventing the wheel with new codes for audio formats,
> > new documentation for them, etc.
>
> I thought that discussion had been had already at one of the earlier
> versions? TBH I've not really been paying attention to this since the
> very early versions where I raised some similar "why is this in media"
> points and I thought everyone had decided that this did actually make
> sense.

Yeah, it was discussed in v1 and v2 threads, e.g.
https://patchwork.kernel.org/project/linux-media/cover/[email protected]/#25485573

My argument at that time was how the operation would be, and the point
was that it'd be a "batch-like" operation via M2M without any timing
control. It'd be a very special usage for ALSA, and if any, it'd
be hwdep -- that is a very hardware-specific API implementation -- or
try compress-offload API, which looks dubious.

OTOH, the argument was that there is already a framework for M2M in
media API and that also fits the batch-like operation, too. That is
how the thread evolved until now.


thanks,

Takashi

2024-05-02 09:00:22

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

Em Thu, 02 May 2024 09:46:14 +0200
Takashi Iwai <[email protected]> escreveu:

> On Wed, 01 May 2024 03:56:15 +0200,
> Mark Brown wrote:
> >
> > On Tue, Apr 30, 2024 at 05:27:52PM +0100, Mauro Carvalho Chehab wrote:
> > > Mark Brown <[email protected]> escreveu:
> > > > On Tue, Apr 30, 2024 at 10:21:12AM +0200, Sebastian Fricke wrote:
> >
> > > > The discussion around this originally was that all the audio APIs are
> > > > very much centered around real time operations rather than completely
> >
> > > The media subsystem is also centered around real time. Without real
> > > time, you can't have a decent video conference system. Having
> > > mem2mem transfers actually help reducing real time delays, as it
> > > avoids extra latency due to CPU congestion and/or data transfers
> > > from/to userspace.
> >
> > Real time means strongly tied to wall clock times rather than fast - the
> > issue was that all the ALSA APIs are based around pushing data through
> > the system based on a clock.
> >
> > > > That doesn't sound like an immediate solution to maintainer overload
> > > > issues... if something like this is going to happen the DRM solution
> > > > does seem more general but I'm not sure the amount of stop energy is
> > > > proportionate.
> >
> > > I don't think maintainer overload is the issue here. The main
> > > point is to avoid a fork at the audio uAPI, plus the burden
> > > of re-inventing the wheel with new codes for audio formats,
> > > new documentation for them, etc.
> >
> > I thought that discussion had been had already at one of the earlier
> > versions? TBH I've not really been paying attention to this since the
> > very early versions where I raised some similar "why is this in media"
> > points and I thought everyone had decided that this did actually make
> > sense.
>
> Yeah, it was discussed in v1 and v2 threads, e.g.
> https://patchwork.kernel.org/project/linux-media/cover/[email protected]/#25485573
>
> My argument at that time was how the operation would be, and the point
> was that it'd be a "batch-like" operation via M2M without any timing
> control. It'd be a very special usage for ALSA, and if any, it'd
> be hwdep -- that is a very hardware-specific API implementation -- or
> try compress-offload API, which looks dubious.
>
> OTOH, the argument was that there is already a framework for M2M in
> media API and that also fits the batch-like operation, too. That is
> how the thread evolved until now.

M2M transfers are not a hardware-specific API, and such transfers
are not new either. Old media devices like bttv have
internally a way to do PCI2PCI transfers, allowing media streams
to be transferred directly without utilizing CPU. The media driver
supports it for video, as this made a huge difference of performance
back then.

On embedded world, this is a pretty common scenario: different media
IP blocks can communicate with each other directly via memory. This
can happen for video capture, video display and audio.

With M2M, most of the control is offloaded to the hardware.

There is still time control associated with it, as audio and video
need to be in sync. This is done by controlling the buffer sizes
and can be fine-tuned by checking when the buffer transfer is done.

On media, M2M buffer transfers are started via VIDIOC_QBUF,
which is a request to do a frame transfer. A similar ioctl
(VIDIOC_DQBUF) is used to monitor when the hardware finishes
transferring the buffer. In other words, the CPU is responsible
for time control.

In other words, this is still real time. The main difference
from a "sync" transfer is that the CPU doesn't need to copy data
from/to different devices, as that operation is offloaded to the
hardware.

Regards,
Mauro

2024-05-02 09:27:40

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

Em Thu, 2 May 2024 09:59:56 +0100
Mauro Carvalho Chehab <[email protected]> escreveu:

> Em Thu, 02 May 2024 09:46:14 +0200
> Takashi Iwai <[email protected]> escreveu:
>
> > On Wed, 01 May 2024 03:56:15 +0200,
> > Mark Brown wrote:
> > >
> > > On Tue, Apr 30, 2024 at 05:27:52PM +0100, Mauro Carvalho Chehab wrote:
> > > > Mark Brown <[email protected]> escreveu:
> > > > > On Tue, Apr 30, 2024 at 10:21:12AM +0200, Sebastian Fricke wrote:
> > >
> > > > > The discussion around this originally was that all the audio APIs are
> > > > > very much centered around real time operations rather than completely
> > >
> > > > The media subsystem is also centered around real time. Without real
> > > > time, you can't have a decent video conference system. Having
> > > > mem2mem transfers actually help reducing real time delays, as it
> > > > avoids extra latency due to CPU congestion and/or data transfers
> > > > from/to userspace.
> > >
> > > Real time means strongly tied to wall clock times rather than fast - the
> > > issue was that all the ALSA APIs are based around pushing data through
> > > the system based on a clock.
> > >
> > > > > That doesn't sound like an immediate solution to maintainer overload
> > > > > issues... if something like this is going to happen the DRM solution
> > > > > does seem more general but I'm not sure the amount of stop energy is
> > > > > proportionate.
> > >
> > > > I don't think maintainer overload is the issue here. The main
> > > > point is to avoid a fork at the audio uAPI, plus the burden
> > > > of re-inventing the wheel with new codes for audio formats,
> > > > new documentation for them, etc.
> > >
> > > I thought that discussion had been had already at one of the earlier
> > > versions? TBH I've not really been paying attention to this since the
> > > very early versions where I raised some similar "why is this in media"
> > > points and I thought everyone had decided that this did actually make
> > > sense.
> >
> > Yeah, it was discussed in v1 and v2 threads, e.g.
> > https://patchwork.kernel.org/project/linux-media/cover/[email protected]/#25485573
> >
> > My argument at that time was how the operation would be, and the point
> > was that it'd be a "batch-like" operation via M2M without any timing
> > control. It'd be a very special usage for ALSA, and if any, it'd
> > be hwdep -- that is a very hardware-specific API implementation -- or
> > try compress-offload API, which looks dubious.
> >
> > OTOH, the argument was that there is already a framework for M2M in
> > media API and that also fits the batch-like operation, too. That is
> > how the thread evolved until now.
>
> M2M transfers are not a hardware-specific API, and such transfers
> are not new either. Old media devices like bttv have
> internally a way to do PCI2PCI transfers, allowing media streams
> to be transferred directly without utilizing CPU. The media driver
> supports it for video, as this made a huge difference of performance
> back then.
>
> On embedded world, this is a pretty common scenario: different media
> IP blocks can communicate with each other directly via memory. This
> can happen for video capture, video display and audio.
>
> With M2M, most of the control is offloaded to the hardware.
>
> There are still time control associated with it, as audio and video
> needs to be in sync. This is done by controlling the buffers size
> and could be fine-tuned by checking when the buffer transfer is done.
>
> On media, M2M buffer transfers are started via VIDIOC_QBUF,
> which is a request to do a frame transfer. A similar ioctl
> (VIDIOC_DQBUF) is used to monitor when the hardware finishes
> transfering the buffer. On other words, the CPU is responsible
> for time control.

Just complementing: on media, we do this per video buffer (or
per half video buffer). A typical use case on cameras is to have
buffers transferred 30 times per second, if the video was streamed
at 30 frames per second.

I would assume that, on an audio/video stream, the audio data
transfer will be programmed to happen at a regular interval as well.

So, if the video stream is programmed at a 30 frames per second
rate, I would assume that the associated audio stream will also be
programmed to be grouped into 30 data transfers per second. In such
a scenario, if the audio is sampled at 48 kHz, it means that:

1) each M2M transfer commanded by the CPU will copy 1600 samples;
2) the time between each sample will remain 1/48000 s;
3) a notification event reporting that 1600 samples were transferred
will be generated when the last sample completes;
4) the CPU will do time control by looking at the notification events.

> On other words, this is still real time. The main difference
> from a "sync" transfer is that the CPU doesn't need to copy data
> from/to different devices, as such operation is offloaded to the
> hardware.
>
> Regards,
> Mauro

2024-05-03 01:47:32

by Mark Brown

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On Thu, May 02, 2024 at 10:26:43AM +0100, Mauro Carvalho Chehab wrote:
> Mauro Carvalho Chehab <[email protected]> escreveu:

> > There are still time control associated with it, as audio and video
> > needs to be in sync. This is done by controlling the buffers size
> > and could be fine-tuned by checking when the buffer transfer is done.

...

> Just complementing: on media, we do this per video buffer (or
> per half video buffer). A typical use case on cameras is to have
> buffers transferred 30 times per second, if the video was streamed
> at 30 frames per second.

IIRC some big use case for this hardware was transcoding so there was a
desire to just go at whatever rate the hardware could support as there
is no interactive user consuming the output as it is generated.

> I would assume that, on an audio/video stream, the audio data
> transfer will be programmed to also happen on a regular interval.

With audio the API is very much "wake userspace every Xms".



2024-05-03 08:43:25

by Mauro Carvalho Chehab

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

Em Fri, 3 May 2024 10:47:19 +0900
Mark Brown <[email protected]> escreveu:

> On Thu, May 02, 2024 at 10:26:43AM +0100, Mauro Carvalho Chehab wrote:
> > Mauro Carvalho Chehab <[email protected]> escreveu:
>
> > > There are still time control associated with it, as audio and video
> > > needs to be in sync. This is done by controlling the buffers size
> > > and could be fine-tuned by checking when the buffer transfer is done.
>
> ...
>
> > Just complementing: on media, we do this per video buffer (or
> > per half video buffer). A typical use case on cameras is to have
> > buffers transferred 30 times per second, if the video was streamed
> > at 30 frames per second.
>
> IIRC some big use case for this hardware was transcoding so there was a
> desire to just go at whatever rate the hardware could support as there
> is no interactive user consuming the output as it is generated.

Indeed, codecs could be used just to do transcoding, but I would
expect that to be a borderline use case. See, as the chipsets implementing
codecs are typically the ones used on mobiles, I would expect
the major use cases to be watching audio and video and participating
in audio/video conferences.

Going further, the codec API may end up supporting not only transcoding
(which is something the CPU can usually handle without too much
processing) but also audio processing that may require more
complex algorithms - even deep learning ones - like background noise
removal, echo detection/removal, volume auto-gain, audio enhancement
and such.

In other words, the typical use cases will have either the input
or the output be a physical device (microphone or speaker).

> > I would assume that, on an audio/video stream, the audio data
> > transfer will be programmed to also happen on a regular interval.
>
> With audio the API is very much "wake userspace every Xms".

2024-05-06 08:49:57

by Shengjiu Wang

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On Fri, May 3, 2024 at 4:42 PM Mauro Carvalho Chehab <[email protected]> wrote:
>
> Em Fri, 3 May 2024 10:47:19 +0900
> Mark Brown <[email protected]> escreveu:
>
> > On Thu, May 02, 2024 at 10:26:43AM +0100, Mauro Carvalho Chehab wrote:
> > > Mauro Carvalho Chehab <[email protected]> escreveu:
> >
> > > > There are still time control associated with it, as audio and video
> > > > needs to be in sync. This is done by controlling the buffers size
> > > > and could be fine-tuned by checking when the buffer transfer is done.
> >
> > ...
> >
> > > Just complementing: on media, we do this per video buffer (or
> > > per half video buffer). A typical use case on cameras is to have
> > > buffers transferred 30 times per second, if the video was streamed
> > > at 30 frames per second.
> >
> > IIRC some big use case for this hardware was transcoding so there was a
> > desire to just go at whatever rate the hardware could support as there
> > is no interactive user consuming the output as it is generated.
>
> Indeed, codecs could be used to just do transcoding, but I would
> expect it to be a border use case. See, as the chipsets implementing
> codecs are typically the ones used on mobiles, I would expect that
> the major use cases to be to watch audio and video and to participate
> on audio/video conferences.
>
> Going further, the codec API may end supporting not only transcoding
> (which is something that CPU can usually handle without too much
> processing) but also audio processing that may require more
> complex algorithms - even deep learning ones - like background noise
> removal, echo detection/removal, volume auto-gain, audio enhancement
> and such.
>
> On other words, the typical use cases will either have input
> or output being a physical hardware (microphone or speaker).
>

All, thanks for taking the time to discuss this; it seems we have come
back to the starting point of this topic again.

Our main request is that there is a hardware sample rate converter
on the chip, and users want to use it from user space as a component,
just like a software sample rate converter. It would mostly run as a
GStreamer plugin, so it is a memory-to-memory component.

I didn't find an API in ALSA for this purpose; the best option I found
in the kernel is the V4L2 memory-to-memory framework.
As Hans said, it is well designed for memory to memory.

And I think audio is part of 'media'. As I can see, part of the Radio
function is in ALSA and part of it is in V4L2; part of the HDMI
function is in DRM and part of it is in ALSA...
So using V4L2 for audio is not new from this point of view.

Even now I still think V4L2 is the best option, but it looks like there
are a lot of objections. If we develop a new ALSA mem2mem API, it is also
a duplication of code (a bigger duplication than just adding audio support
in V4L2, I think).

Best regards
Shengjiu Wang.

> > > I would assume that, on an audio/video stream, the audio data
> > > transfer will be programmed to also happen on a regular interval.
> >
> > With audio the API is very much "wake userspace every Xms".

2024-05-06 09:43:17

by Jaroslav Kysela

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 06. 05. 24 10:49, Shengjiu Wang wrote:

> Even now I still think V4L2 is the best option, but it looks like there
> are a lot of rejects. If develop a new ALSA-mem2mem, it is also
> a duplication of code (bigger duplication that just add audio support
> in V4L2 I think).

Maybe not. Could you try to evaluate a pure dma-buf (drivers/dma-buf) solution
and add only an enumeration and operation trigger mechanism to the ALSA API?
It seems that dma-buf has sufficient code to transfer data to and from
kernel space for further processing. I think that one buffer can serve as
the source and a second one for the processed data.

We could eventually add new ioctls to ALSA's control API (/dev/snd/control*)
for this purpose (DSP processing).

Jaroslav

--
Jaroslav Kysela <[email protected]>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.


2024-05-08 08:09:14

by Hans Verkuil

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 06/05/2024 10:49, Shengjiu Wang wrote:
> On Fri, May 3, 2024 at 4:42 PM Mauro Carvalho Chehab <[email protected]> wrote:
>>
>> Em Fri, 3 May 2024 10:47:19 +0900
>> Mark Brown <[email protected]> escreveu:
>>
>>> On Thu, May 02, 2024 at 10:26:43AM +0100, Mauro Carvalho Chehab wrote:
>>>> Mauro Carvalho Chehab <[email protected]> escreveu:
>>>
>>>>> There are still time control associated with it, as audio and video
>>>>> needs to be in sync. This is done by controlling the buffers size
>>>>> and could be fine-tuned by checking when the buffer transfer is done.
>>>
>>> ...
>>>
>>>> Just complementing: on media, we do this per video buffer (or
>>>> per half video buffer). A typical use case on cameras is to have
>>>> buffers transferred 30 times per second, if the video was streamed
>>>> at 30 frames per second.
>>>
>>> IIRC some big use case for this hardware was transcoding so there was a
>>> desire to just go at whatever rate the hardware could support as there
>>> is no interactive user consuming the output as it is generated.
>>
>> Indeed, codecs could be used to just do transcoding, but I would
>> expect it to be a border use case. See, as the chipsets implementing
>> codecs are typically the ones used on mobiles, I would expect that
>> the major use cases to be to watch audio and video and to participate
>> on audio/video conferences.
>>
>> Going further, the codec API may end supporting not only transcoding
>> (which is something that CPU can usually handle without too much
>> processing) but also audio processing that may require more
>> complex algorithms - even deep learning ones - like background noise
>> removal, echo detection/removal, volume auto-gain, audio enhancement
>> and such.
>>
>> On other words, the typical use cases will either have input
>> or output being a physical hardware (microphone or speaker).
>>
>
> All, thanks for spending time to discuss, it seems we go back to
> the start point of this topic again.
>
> Our main request is that there is a hardware sample rate converter
> on the chip, so users can use it in user space as a component like
> software sample rate converter. It mostly may run as a gstreamer plugin.
> so it is a memory to memory component.
>
> I didn't find such API in ALSA for such purpose, the best option for this
> in the kernel is the V4L2 memory to memory framework I found.
> As Hans said it is well designed for memory to memory.
>
> And I think audio is one of 'media'. As I can see that part of Radio
> function is in ALSA, part of Radio function is in V4L2. part of HDMI
> function is in DRM, part of HDMI function is in ALSA...
> So using V4L2 for audio is not new from this point of view.
>
> Even now I still think V4L2 is the best option, but it looks like there
> are a lot of rejects. If develop a new ALSA-mem2mem, it is also
> a duplication of code (bigger duplication that just add audio support
> in V4L2 I think).

After reading this thread I still believe that the mem2mem framework is
a reasonable option, unless someone can come up with a method that is
easy to implement in the alsa subsystem. From what I can tell from this
discussion no such method exists.

From the media side there are arguments that it adds extra maintenance
load, which is true, but I believe that it is quite limited in practice.

That said, perhaps we should make a statement that while we support the
use of audio m2m drivers, this is only for simple m2m audio processing like
this driver, specifically where there is a 1-to-1 mapping between input and
output buffers. At this point we do not want to add audio codec support or
similar complex audio processing.

Part of the reason is that codecs are hard, and we already have our hands
full with all the video codecs. Part of the reason is that the v4l2-mem2mem
framework probably needs to be forked to make a more advanced version geared
towards codecs since the current framework is too limiting for some of the
things we want to do. It was really designed for scalers, deinterlacers, etc.
and the codec support was added later.

If we ever allow such complex audio processing devices, then we would have
to have another discussion, and I believe that will only be possible if
most of the maintenance load would be on the alsa subsystem where the audio
experts are.

So my proposal is to:

1) add a clear statement to dev-audio-mem2mem.rst (patch 08/16) that only
simple audio devices with a 1-to-1 mapping of input to output buffer are
supported. Perhaps also in videodev2.h before struct v4l2_audio_format.

2) I will experiment a bit trying to solve the main complaint about creating
new audio fourcc values and thus duplicating existing SNDRV_PCM_FORMAT_
values. I have some ideas for that.

But I do not want to spend time on 2 until we agree that this is the way
forward.

Regards,

Hans

2024-05-08 13:33:47

by Amadeusz Sławiński

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 5/8/2024 10:00 AM, Hans Verkuil wrote:
> On 06/05/2024 10:49, Shengjiu Wang wrote:
>> On Fri, May 3, 2024 at 4:42 PM Mauro Carvalho Chehab <[email protected]> wrote:
>>>
>>> Em Fri, 3 May 2024 10:47:19 +0900
>>> Mark Brown <[email protected]> escreveu:
>>>
>>>> On Thu, May 02, 2024 at 10:26:43AM +0100, Mauro Carvalho Chehab wrote:
>>>>> Mauro Carvalho Chehab <[email protected]> escreveu:
>>>>
>>>>>> There are still time control associated with it, as audio and video
>>>>>> needs to be in sync. This is done by controlling the buffers size
>>>>>> and could be fine-tuned by checking when the buffer transfer is done.
>>>>
>>>> ...
>>>>
>>>>> Just complementing: on media, we do this per video buffer (or
>>>>> per half video buffer). A typical use case on cameras is to have
>>>>> buffers transferred 30 times per second, if the video was streamed
>>>>> at 30 frames per second.
>>>>
>>>> IIRC some big use case for this hardware was transcoding so there was a
>>>> desire to just go at whatever rate the hardware could support as there
>>>> is no interactive user consuming the output as it is generated.
>>>
>>> Indeed, codecs could be used to just do transcoding, but I would
>>> expect it to be a border use case. See, as the chipsets implementing
>>> codecs are typically the ones used on mobiles, I would expect that
>>> the major use cases to be to watch audio and video and to participate
>>> on audio/video conferences.
>>>
>>> Going further, the codec API may end supporting not only transcoding
>>> (which is something that CPU can usually handle without too much
>>> processing) but also audio processing that may require more
>>> complex algorithms - even deep learning ones - like background noise
>>> removal, echo detection/removal, volume auto-gain, audio enhancement
>>> and such.
>>>
>>> On other words, the typical use cases will either have input
>>> or output being a physical hardware (microphone or speaker).
>>>
>>
>> All, thanks for spending time to discuss, it seems we go back to
>> the start point of this topic again.
>>
>> Our main request is that there is a hardware sample rate converter
>> on the chip, so users can use it in user space as a component like
>> software sample rate converter. It mostly may run as a gstreamer plugin.
>> so it is a memory to memory component.
>>
>> I didn't find such API in ALSA for such purpose, the best option for this
>> in the kernel is the V4L2 memory to memory framework I found.
>> As Hans said it is well designed for memory to memory.
>>
>> And I think audio is one of 'media'. As I can see that part of Radio
>> function is in ALSA, part of Radio function is in V4L2. part of HDMI
>> function is in DRM, part of HDMI function is in ALSA...
>> So using V4L2 for audio is not new from this point of view.
>>
>> Even now I still think V4L2 is the best option, but it looks like there
>> are a lot of rejects. If develop a new ALSA-mem2mem, it is also
>> a duplication of code (bigger duplication that just add audio support
>> in V4L2 I think).
>
> After reading this thread I still believe that the mem2mem framework is
> a reasonable option, unless someone can come up with a method that is
> easy to implement in the alsa subsystem. From what I can tell from this
> discussion no such method exists.
>

Hi,

my main question would be: how is the mem2mem use case different from a
loopback exposing playback and capture frontends to user space, with a DSP
(or other piece of HW) in the middle?

Amadeusz


2024-05-09 09:37:25

by Shengjiu Wang

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On Wed, May 8, 2024 at 4:14 PM Amadeusz Sławiński
<[email protected]> wrote:
>
> On 5/8/2024 10:00 AM, Hans Verkuil wrote:
> > On 06/05/2024 10:49, Shengjiu Wang wrote:
> >> On Fri, May 3, 2024 at 4:42 PM Mauro Carvalho Chehab <[email protected]> wrote:
> >>>
> >>> Em Fri, 3 May 2024 10:47:19 +0900
> >>> Mark Brown <[email protected]> escreveu:
> >>>
> >>>> On Thu, May 02, 2024 at 10:26:43AM +0100, Mauro Carvalho Chehab wrote:
> >>>>> Mauro Carvalho Chehab <[email protected]> escreveu:
> >>>>
> >>>>>> There are still time control associated with it, as audio and video
> >>>>>> needs to be in sync. This is done by controlling the buffers size
> >>>>>> and could be fine-tuned by checking when the buffer transfer is done.
> >>>>
> >>>> ...
> >>>>
> >>>>> Just complementing: on media, we do this per video buffer (or
> >>>>> per half video buffer). A typical use case on cameras is to have
> >>>>> buffers transferred 30 times per second, if the video was streamed
> >>>>> at 30 frames per second.
> >>>>
> >>>> IIRC some big use case for this hardware was transcoding so there was a
> >>>> desire to just go at whatever rate the hardware could support as there
> >>>> is no interactive user consuming the output as it is generated.
> >>>
> >>> Indeed, codecs could be used to just do transcoding, but I would
> >>> expect it to be a border use case. See, as the chipsets implementing
> >>> codecs are typically the ones used on mobiles, I would expect that
> >>> the major use cases to be to watch audio and video and to participate
> >>> on audio/video conferences.
> >>>
> >>> Going further, the codec API may end supporting not only transcoding
> >>> (which is something that CPU can usually handle without too much
> >>> processing) but also audio processing that may require more
> >>> complex algorithms - even deep learning ones - like background noise
> >>> removal, echo detection/removal, volume auto-gain, audio enhancement
> >>> and such.
> >>>
> >>> On other words, the typical use cases will either have input
> >>> or output being a physical hardware (microphone or speaker).
> >>>
> >>
> >> All, thanks for spending time to discuss, it seems we go back to
> >> the start point of this topic again.
> >>
> >> Our main request is that there is a hardware sample rate converter
> >> on the chip, so users can use it in user space as a component like
> >> software sample rate converter. It mostly may run as a gstreamer plugin.
> >> so it is a memory to memory component.
> >>
> >> I didn't find such API in ALSA for such purpose, the best option for this
> >> in the kernel is the V4L2 memory to memory framework I found.
> >> As Hans said it is well designed for memory to memory.
> >>
> >> And I think audio is one of 'media'. As I can see that part of Radio
> >> function is in ALSA, part of Radio function is in V4L2. part of HDMI
> >> function is in DRM, part of HDMI function is in ALSA...
> >> So using V4L2 for audio is not new from this point of view.
> >>
> >> Even now I still think V4L2 is the best option, but it looks like there
> >> are a lot of rejects. If develop a new ALSA-mem2mem, it is also
> >> a duplication of code (bigger duplication that just add audio support
> >> in V4L2 I think).
> >
> > After reading this thread I still believe that the mem2mem framework is
> > a reasonable option, unless someone can come up with a method that is
> > easy to implement in the alsa subsystem. From what I can tell from this
> > discussion no such method exists.
> >
>
> Hi,
>
> my main question would be how is mem2mem use case different from
> loopback exposing playback and capture frontends in user space with DSP
> (or other piece of HW) in the middle?
>
I think loopback has timing control: the user needs to feed data to playback
at a fixed rate and fetch data from capture at a fixed rate; otherwise there
is an xrun in playback or capture.

In the mem2mem case there is no such timing control: the user feeds data to
it and it generates output; if the user doesn't feed data, there is no xrun.
But mem2mem is just one of the components in a playback or capture
pipeline, so overall there is still timing control for the whole pipeline.

Best regards
Shengjiu Wang

> Amadeusz
>

2024-05-09 10:12:42

by Shengjiu Wang

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On Thu, May 9, 2024 at 5:50 PM Amadeusz Sławiński
<[email protected]> wrote:
>
> On 5/9/2024 11:36 AM, Shengjiu Wang wrote:
> > On Wed, May 8, 2024 at 4:14 PM Amadeusz Sławiński
> > <[email protected]> wrote:
> >>
> >> On 5/8/2024 10:00 AM, Hans Verkuil wrote:
> >>> On 06/05/2024 10:49, Shengjiu Wang wrote:
> >>>> On Fri, May 3, 2024 at 4:42 PM Mauro Carvalho Chehab <[email protected]> wrote:
> >>>>>
> >>>>> Em Fri, 3 May 2024 10:47:19 +0900
> >>>>> Mark Brown <[email protected]> escreveu:
> >>>>>
> >>>>>> On Thu, May 02, 2024 at 10:26:43AM +0100, Mauro Carvalho Chehab wrote:
> >>>>>>> Mauro Carvalho Chehab <[email protected]> escreveu:
> >>>>>>
> >>>>>>>> There are still time control associated with it, as audio and video
> >>>>>>>> needs to be in sync. This is done by controlling the buffers size
> >>>>>>>> and could be fine-tuned by checking when the buffer transfer is done.
> >>>>>>
> >>>>>> ...
> >>>>>>
> >>>>>>> Just complementing: on media, we do this per video buffer (or
> >>>>>>> per half video buffer). A typical use case on cameras is to have
> >>>>>>> buffers transferred 30 times per second, if the video was streamed
> >>>>>>> at 30 frames per second.
> >>>>>>
> >>>>>> IIRC some big use case for this hardware was transcoding so there was a
> >>>>>> desire to just go at whatever rate the hardware could support as there
> >>>>>> is no interactive user consuming the output as it is generated.
> >>>>>
> >>>>> Indeed, codecs could be used to just do transcoding, but I would
> >>>>> expect it to be a border use case. See, as the chipsets implementing
> >>>>> codecs are typically the ones used on mobiles, I would expect that
> >>>>> the major use cases to be to watch audio and video and to participate
> >>>>> on audio/video conferences.
> >>>>>
> >>>>> Going further, the codec API may end supporting not only transcoding
> >>>>> (which is something that CPU can usually handle without too much
> >>>>> processing) but also audio processing that may require more
> >>>>> complex algorithms - even deep learning ones - like background noise
> >>>>> removal, echo detection/removal, volume auto-gain, audio enhancement
> >>>>> and such.
> >>>>>
> >>>>> On other words, the typical use cases will either have input
> >>>>> or output being a physical hardware (microphone or speaker).
> >>>>>
> >>>>
> >>>> All, thanks for spending time to discuss, it seems we go back to
> >>>> the start point of this topic again.
> >>>>
> >>>> Our main request is that there is a hardware sample rate converter
> >>>> on the chip, so users can use it in user space as a component like
> >>>> software sample rate converter. It mostly may run as a gstreamer plugin.
> >>>> so it is a memory to memory component.
> >>>>
> >>>> I didn't find such API in ALSA for such purpose, the best option for this
> >>>> in the kernel is the V4L2 memory to memory framework I found.
> >>>> As Hans said it is well designed for memory to memory.
> >>>>
> >>>> And I think audio is one of 'media'. As I can see that part of Radio
> >>>> function is in ALSA, part of Radio function is in V4L2. part of HDMI
> >>>> function is in DRM, part of HDMI function is in ALSA...
> >>>> So using V4L2 for audio is not new from this point of view.
> >>>>
> >>>> Even now I still think V4L2 is the best option, but it looks like there
> >>>> are a lot of rejects. If develop a new ALSA-mem2mem, it is also
> >>>> a duplication of code (bigger duplication that just add audio support
> >>>> in V4L2 I think).
> >>>
> >>> After reading this thread I still believe that the mem2mem framework is
> >>> a reasonable option, unless someone can come up with a method that is
> >>> easy to implement in the alsa subsystem. From what I can tell from this
> >>> discussion no such method exists.
> >>>
> >>
> >> Hi,
> >>
> >> my main question would be how is mem2mem use case different from
> >> loopback exposing playback and capture frontends in user space with DSP
> >> (or other piece of HW) in the middle?
> >>
> > I think loopback has a timing control, user need to feed data to playback at a
> > fixed time and get data from capture at a fixed time. Otherwise there
> > is xrun in
> > playback and capture.
> >
> > mem2mem case: there is no such timing control, user feeds data to it
> > then it generates output, if user doesn't feed data, there is no xrun.
> > but mem2mem is just one of the components in the playback or capture
> > pipeline, overall there is time control for whole pipeline,
> >
>
> Have you looked at compress streams? If I remember correctly they are
> not tied to time due to the fact that they can pass data in arbitrary
> formats?
>
> From:
> https://docs.kernel.org/sound/designs/compress-offload.html
>
> "No notion of underrun/overrun. Since the bytes written are compressed
> in nature and data written/read doesn’t translate directly to rendered
> output in time, this does not deal with underrun/overrun and maybe dealt
> in user-library"

I checked the compress stream. The mem2mem case is different from the
compress-offload case.

The compress-offload case is a full pipeline: the user sends a compressed
stream to it, then the DSP decodes it and renders it to the speaker in
real time.

mem2mem is just like the decoder in the compress pipeline, i.e. only
one of the components in the pipeline.

Best regards
Shengjiu Wang
>
> Amadeusz

2024-05-09 10:28:22

by Amadeusz Sławiński

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 5/9/2024 12:12 PM, Shengjiu Wang wrote:
> On Thu, May 9, 2024 at 5:50 PM Amadeusz Sławiński
> <[email protected]> wrote:
>>
>> On 5/9/2024 11:36 AM, Shengjiu Wang wrote:
>>> On Wed, May 8, 2024 at 4:14 PM Amadeusz Sławiński
>>> <[email protected]> wrote:
>>>>
>>>> On 5/8/2024 10:00 AM, Hans Verkuil wrote:
>>>>> On 06/05/2024 10:49, Shengjiu Wang wrote:
>>>>>> On Fri, May 3, 2024 at 4:42 PM Mauro Carvalho Chehab <[email protected]> wrote:
>>>>>>>
>>>>>>> Em Fri, 3 May 2024 10:47:19 +0900
>>>>>>> Mark Brown <[email protected]> escreveu:
>>>>>>>
>>>>>>>> On Thu, May 02, 2024 at 10:26:43AM +0100, Mauro Carvalho Chehab wrote:
>>>>>>>>> Mauro Carvalho Chehab <[email protected]> escreveu:
>>>>>>>>
>>>>>>>>>> There are still time control associated with it, as audio and video
>>>>>>>>>> needs to be in sync. This is done by controlling the buffers size
>>>>>>>>>> and could be fine-tuned by checking when the buffer transfer is done.
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>>> Just complementing: on media, we do this per video buffer (or
>>>>>>>>> per half video buffer). A typical use case on cameras is to have
>>>>>>>>> buffers transferred 30 times per second, if the video was streamed
>>>>>>>>> at 30 frames per second.
>>>>>>>>
>>>>>>>> IIRC some big use case for this hardware was transcoding so there was a
>>>>>>>> desire to just go at whatever rate the hardware could support as there
>>>>>>>> is no interactive user consuming the output as it is generated.
>>>>>>>
>>>>>>> Indeed, codecs could be used to just do transcoding, but I would
>>>>>>> expect it to be a border use case. See, as the chipsets implementing
>>>>>>> codecs are typically the ones used on mobiles, I would expect that
>>>>>>> the major use cases to be to watch audio and video and to participate
>>>>>>> on audio/video conferences.
>>>>>>>
>>>>>>> Going further, the codec API may end supporting not only transcoding
>>>>>>> (which is something that CPU can usually handle without too much
>>>>>>> processing) but also audio processing that may require more
>>>>>>> complex algorithms - even deep learning ones - like background noise
>>>>>>> removal, echo detection/removal, volume auto-gain, audio enhancement
>>>>>>> and such.
>>>>>>>
>>>>>>> On other words, the typical use cases will either have input
>>>>>>> or output being a physical hardware (microphone or speaker).
>>>>>>>
>>>>>>
>>>>>> All, thanks for spending time to discuss, it seems we go back to
>>>>>> the start point of this topic again.
>>>>>>
>>>>>> Our main request is that there is a hardware sample rate converter
>>>>>> on the chip, so users can use it in user space as a component like
>>>>>> software sample rate converter. It mostly may run as a gstreamer plugin.
>>>>>> so it is a memory to memory component.
>>>>>>
>>>>>> I didn't find such API in ALSA for such purpose, the best option for this
>>>>>> in the kernel is the V4L2 memory to memory framework I found.
>>>>>> As Hans said it is well designed for memory to memory.
>>>>>>
>>>>>> And I think audio is one of 'media'. As I can see that part of Radio
>>>>>> function is in ALSA, part of Radio function is in V4L2. part of HDMI
>>>>>> function is in DRM, part of HDMI function is in ALSA...
>>>>>> So using V4L2 for audio is not new from this point of view.
>>>>>>
>>>>>> Even now I still think V4L2 is the best option, but it looks like there
>>>>>> are a lot of rejects. If develop a new ALSA-mem2mem, it is also
>>>>>> a duplication of code (bigger duplication that just add audio support
>>>>>> in V4L2 I think).
>>>>>
>>>>> After reading this thread I still believe that the mem2mem framework is
>>>>> a reasonable option, unless someone can come up with a method that is
>>>>> easy to implement in the alsa subsystem. From what I can tell from this
>>>>> discussion no such method exists.
>>>>>
>>>>
>>>> Hi,
>>>>
>>>> my main question would be how is mem2mem use case different from
>>>> loopback exposing playback and capture frontends in user space with DSP
>>>> (or other piece of HW) in the middle?
>>>>
>>> I think loopback has a timing control, user need to feed data to playback at a
>>> fixed time and get data from capture at a fixed time. Otherwise there
>>> is xrun in
>>> playback and capture.
>>>
>>> mem2mem case: there is no such timing control, user feeds data to it
>>> then it generates output, if user doesn't feed data, there is no xrun.
>>> but mem2mem is just one of the components in the playback or capture
>>> pipeline, overall there is time control for whole pipeline,
>>>
>>
>> Have you looked at compress streams? If I remember correctly they are
>> not tied to time due to the fact that they can pass data in arbitrary
>> formats?
>>
>> From:
>> https://docs.kernel.org/sound/designs/compress-offload.html
>>
>> "No notion of underrun/overrun. Since the bytes written are compressed
>> in nature and data written/read doesn’t translate directly to rendered
>> output in time, this does not deal with underrun/overrun and maybe dealt
>> in user-library"
>
> I checked the compress stream. mem2mem case is different with
> compress-offload case
>
> compress-offload case is a full pipeline, the user sends a compress
> stream to it, then DSP decodes it and renders it to the speaker in real
> time.
>
> mem2mem is just like the decoder in the compress pipeline. which is
> one of the components in the pipeline.

I was thinking of loopback with endpoints using compress streams,
without physical endpoint, something like:

compress playback (to feed data from userspace) -> DSP (processing) ->
compress capture (send data back to userspace)

Unless I'm missing something, you should be able to process data as fast
as you can feed it and consume it in such case.

Amadeusz

2024-05-09 10:44:29

by Shengjiu Wang

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On Thu, May 9, 2024 at 6:28 PM Amadeusz Sławiński
<[email protected]> wrote:
>
> On 5/9/2024 12:12 PM, Shengjiu Wang wrote:
> > On Thu, May 9, 2024 at 5:50 PM Amadeusz Sławiński
> > <[email protected]> wrote:
> >>
> >> On 5/9/2024 11:36 AM, Shengjiu Wang wrote:
> >>> On Wed, May 8, 2024 at 4:14 PM Amadeusz Sławiński
> >>> <[email protected]> wrote:
> >>>>
> >>>> On 5/8/2024 10:00 AM, Hans Verkuil wrote:
> >>>>> On 06/05/2024 10:49, Shengjiu Wang wrote:
> >>>>>> On Fri, May 3, 2024 at 4:42 PM Mauro Carvalho Chehab <[email protected]> wrote:
> >>>>>>>
> >>>>>>> Em Fri, 3 May 2024 10:47:19 +0900
> >>>>>>> Mark Brown <[email protected]> escreveu:
> >>>>>>>
> >>>>>>>> On Thu, May 02, 2024 at 10:26:43AM +0100, Mauro Carvalho Chehab wrote:
> >>>>>>>>> Mauro Carvalho Chehab <[email protected]> escreveu:
> >>>>>>>>
> >>>>>>>>>> There are still time control associated with it, as audio and video
> >>>>>>>>>> needs to be in sync. This is done by controlling the buffers size
> >>>>>>>>>> and could be fine-tuned by checking when the buffer transfer is done.
> >>>>>>>>
> >>>>>>>> ...
> >>>>>>>>
> >>>>>>>>> Just complementing: on media, we do this per video buffer (or
> >>>>>>>>> per half video buffer). A typical use case on cameras is to have
> >>>>>>>>> buffers transferred 30 times per second, if the video was streamed
> >>>>>>>>> at 30 frames per second.
> >>>>>>>>
> >>>>>>>> IIRC some big use case for this hardware was transcoding so there was a
> >>>>>>>> desire to just go at whatever rate the hardware could support as there
> >>>>>>>> is no interactive user consuming the output as it is generated.
> >>>>>>>
> >>>>>>> Indeed, codecs could be used to just do transcoding, but I would
> >>>>>>> expect it to be a border use case. See, as the chipsets implementing
> >>>>>>> codecs are typically the ones used on mobiles, I would expect that
> >>>>>>> the major use cases to be to watch audio and video and to participate
> >>>>>>> on audio/video conferences.
> >>>>>>>
> >>>>>>> Going further, the codec API may end supporting not only transcoding
> >>>>>>> (which is something that CPU can usually handle without too much
> >>>>>>> processing) but also audio processing that may require more
> >>>>>>> complex algorithms - even deep learning ones - like background noise
> >>>>>>> removal, echo detection/removal, volume auto-gain, audio enhancement
> >>>>>>> and such.
> >>>>>>>
> >>>>>>> On other words, the typical use cases will either have input
> >>>>>>> or output being a physical hardware (microphone or speaker).
> >>>>>>>
> >>>>>>
> >>>>>> All, thanks for spending time to discuss, it seems we go back to
> >>>>>> the start point of this topic again.
> >>>>>>
> >>>>>> Our main request is that there is a hardware sample rate converter
> >>>>>> on the chip, so users can use it in user space as a component like
> >>>>>> software sample rate converter. It mostly may run as a gstreamer plugin.
> >>>>>> so it is a memory to memory component.
> >>>>>>
> >>>>>> I didn't find such API in ALSA for such purpose, the best option for this
> >>>>>> in the kernel is the V4L2 memory to memory framework I found.
> >>>>>> As Hans said it is well designed for memory to memory.
> >>>>>>
> >>>>>> And I think audio is one of 'media'. As I can see that part of Radio
> >>>>>> function is in ALSA, part of Radio function is in V4L2. part of HDMI
> >>>>>> function is in DRM, part of HDMI function is in ALSA...
> >>>>>> So using V4L2 for audio is not new from this point of view.
> >>>>>>
> >>>>>> Even now I still think V4L2 is the best option, but it looks like there
> >>>>>> are a lot of rejects. If develop a new ALSA-mem2mem, it is also
> >>>>>> a duplication of code (bigger duplication that just add audio support
> >>>>>> in V4L2 I think).
> >>>>>
> >>>>> After reading this thread I still believe that the mem2mem framework is
> >>>>> a reasonable option, unless someone can come up with a method that is
> >>>>> easy to implement in the alsa subsystem. From what I can tell from this
> >>>>> discussion no such method exists.
> >>>>>
> >>>>
> >>>> Hi,
> >>>>
> >>>> my main question would be how is mem2mem use case different from
> >>>> loopback exposing playback and capture frontends in user space with DSP
> >>>> (or other piece of HW) in the middle?
> >>>>
> >>> I think loopback has a timing control, user need to feed data to playback at a
> >>> fixed time and get data from capture at a fixed time. Otherwise there
> >>> is xrun in
> >>> playback and capture.
> >>>
> >>> mem2mem case: there is no such timing control, user feeds data to it
> >>> then it generates output, if user doesn't feed data, there is no xrun.
> >>> but mem2mem is just one of the components in the playback or capture
> >>> pipeline, overall there is time control for whole pipeline,
> >>>
> >>
> >> Have you looked at compress streams? If I remember correctly they are
> >> not tied to time due to the fact that they can pass data in arbitrary
> >> formats?
> >>
> >> From:
> >> https://docs.kernel.org/sound/designs/compress-offload.html
> >>
> >> "No notion of underrun/overrun. Since the bytes written are compressed
> >> in nature and data written/read doesn’t translate directly to rendered
> >> output in time, this does not deal with underrun/overrun and maybe dealt
> >> in user-library"
> >
> > I checked the compress stream. mem2mem case is different with
> > compress-offload case
> >
> > compress-offload case is a full pipeline, the user sends a compress
> > stream to it, then DSP decodes it and renders it to the speaker in real
> > time.
> >
> > mem2mem is just like the decoder in the compress pipeline. which is
> > one of the components in the pipeline.
>
> I was thinking of loopback with endpoints using compress streams,
> without physical endpoint, something like:
>
> compress playback (to feed data from userspace) -> DSP (processing) ->
> compress capture (send data back to userspace)
>
> Unless I'm missing something, you should be able to process data as fast
> as you can feed it and consume it in such case.
>

Actually in the beginning I tried this, but it did not work well.
ALSA needs time control for playback and capture, playback and capture
needs to synchronize. Usually the playback and capture pipeline is
independent in ALSA design, but in this case, the playback and capture
should synchronize, they are not independent.

Best regards
Shengjiu Wang

> Amadeusz

2024-05-09 11:14:19

by Jaroslav Kysela

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 09. 05. 24 12:44, Shengjiu Wang wrote:
>>> mem2mem is just like the decoder in the compress pipeline. which is
>>> one of the components in the pipeline.
>>
>> I was thinking of loopback with endpoints using compress streams,
>> without physical endpoint, something like:
>>
>> compress playback (to feed data from userspace) -> DSP (processing) ->
>> compress capture (send data back to userspace)
>>
>> Unless I'm missing something, you should be able to process data as fast
>> as you can feed it and consume it in such case.
>>
>
> Actually in the beginning I tried this, but it did not work well.
> ALSA needs time control for playback and capture, playback and capture
> needs to synchronize. Usually the playback and capture pipeline is
> independent in ALSA design, but in this case, the playback and capture
> should synchronize, they are not independent.

The core compress API has no strict timing constraints. You can eventually
have two half-duplex compress devices, if you like to have a really independent
mechanism. If something is missing in the API, you can extend this API (like to
inform the user space that it's a producer/consumer processing without any
relation to the real time). I like this idea.

Jaroslav

--
Jaroslav Kysela <[email protected]>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.


2024-05-09 13:29:28

by Amadeusz Sławiński

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 5/9/2024 11:36 AM, Shengjiu Wang wrote:
> On Wed, May 8, 2024 at 4:14 PM Amadeusz Sławiński
> <[email protected]> wrote:
>>
>> On 5/8/2024 10:00 AM, Hans Verkuil wrote:
>>> On 06/05/2024 10:49, Shengjiu Wang wrote:
>>>> On Fri, May 3, 2024 at 4:42 PM Mauro Carvalho Chehab <[email protected]> wrote:
>>>>>
>>>>> Em Fri, 3 May 2024 10:47:19 +0900
>>>>> Mark Brown <[email protected]> escreveu:
>>>>>
>>>>>> On Thu, May 02, 2024 at 10:26:43AM +0100, Mauro Carvalho Chehab wrote:
>>>>>>> Mauro Carvalho Chehab <[email protected]> escreveu:
>>>>>>
>>>>>>>> There are still time control associated with it, as audio and video
>>>>>>>> needs to be in sync. This is done by controlling the buffers size
>>>>>>>> and could be fine-tuned by checking when the buffer transfer is done.
>>>>>>
>>>>>> ...
>>>>>>
>>>>>>> Just complementing: on media, we do this per video buffer (or
>>>>>>> per half video buffer). A typical use case on cameras is to have
>>>>>>> buffers transferred 30 times per second, if the video was streamed
>>>>>>> at 30 frames per second.
>>>>>>
>>>>>> IIRC some big use case for this hardware was transcoding so there was a
>>>>>> desire to just go at whatever rate the hardware could support as there
>>>>>> is no interactive user consuming the output as it is generated.
>>>>>
>>>>> Indeed, codecs could be used to just do transcoding, but I would
>>>>> expect it to be a border use case. See, as the chipsets implementing
>>>>> codecs are typically the ones used on mobiles, I would expect that
>>>>> the major use cases to be to watch audio and video and to participate
>>>>> on audio/video conferences.
>>>>>
>>>>> Going further, the codec API may end supporting not only transcoding
>>>>> (which is something that CPU can usually handle without too much
>>>>> processing) but also audio processing that may require more
>>>>> complex algorithms - even deep learning ones - like background noise
>>>>> removal, echo detection/removal, volume auto-gain, audio enhancement
>>>>> and such.
>>>>>
>>>>> On other words, the typical use cases will either have input
>>>>> or output being a physical hardware (microphone or speaker).
>>>>>
>>>>
>>>> All, thanks for spending time to discuss, it seems we go back to
>>>> the start point of this topic again.
>>>>
>>>> Our main request is that there is a hardware sample rate converter
>>>> on the chip, so users can use it in user space as a component like
>>>> software sample rate converter. It mostly may run as a gstreamer plugin.
>>>> so it is a memory to memory component.
>>>>
>>>> I didn't find such API in ALSA for such purpose, the best option for this
>>>> in the kernel is the V4L2 memory to memory framework I found.
>>>> As Hans said it is well designed for memory to memory.
>>>>
>>>> And I think audio is one of 'media'. As I can see that part of Radio
>>>> function is in ALSA, part of Radio function is in V4L2. part of HDMI
>>>> function is in DRM, part of HDMI function is in ALSA...
>>>> So using V4L2 for audio is not new from this point of view.
>>>>
>>>> Even now I still think V4L2 is the best option, but it looks like there
>>>> are a lot of rejects. If develop a new ALSA-mem2mem, it is also
>>>> a duplication of code (bigger duplication that just add audio support
>>>> in V4L2 I think).
>>>
>>> After reading this thread I still believe that the mem2mem framework is
>>> a reasonable option, unless someone can come up with a method that is
>>> easy to implement in the alsa subsystem. From what I can tell from this
>>> discussion no such method exists.
>>>
>>
>> Hi,
>>
>> my main question would be how is mem2mem use case different from
>> loopback exposing playback and capture frontends in user space with DSP
>> (or other piece of HW) in the middle?
>>
> I think loopback has a timing control, user need to feed data to playback at a
> fixed time and get data from capture at a fixed time. Otherwise there
> is xrun in
> playback and capture.
>
> mem2mem case: there is no such timing control, user feeds data to it
> then it generates output, if user doesn't feed data, there is no xrun.
> but mem2mem is just one of the components in the playback or capture
> pipeline, overall there is time control for whole pipeline,
>

Have you looked at compress streams? If I remember correctly they are
not tied to time due to the fact that they can pass data in arbitrary
formats?

From:
https://docs.kernel.org/sound/designs/compress-offload.html

"No notion of underrun/overrun. Since the bytes written are compressed
in nature and data written/read doesn’t translate directly to rendered
output in time, this does not deal with underrun/overrun and maybe dealt
in user-library"

Amadeusz

2024-05-13 11:57:24

by Jaroslav Kysela

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 09. 05. 24 13:13, Jaroslav Kysela wrote:
> On 09. 05. 24 12:44, Shengjiu Wang wrote:
>>>> mem2mem is just like the decoder in the compress pipeline. which is
>>>> one of the components in the pipeline.
>>>
>>> I was thinking of loopback with endpoints using compress streams,
>>> without physical endpoint, something like:
>>>
>>> compress playback (to feed data from userspace) -> DSP (processing) ->
>>> compress capture (send data back to userspace)
>>>
>>> Unless I'm missing something, you should be able to process data as fast
>>> as you can feed it and consume it in such case.
>>>
>>
>> Actually in the beginning I tried this, but it did not work well.
>> ALSA needs time control for playback and capture, playback and capture
>> needs to synchronize. Usually the playback and capture pipeline is
>> independent in ALSA design, but in this case, the playback and capture
>> should synchronize, they are not independent.
>
> The core compress API has no strict timing constraints. You can eventually
> have two half-duplex compress devices, if you like to have really independent
> mechanism. If something is missing in API, you can extend this API (like to
> inform the user space that it's a producer/consumer processing without any
> relation to the real time). I like this idea.

I was thinking more about this. If I am right, the mentioned use in gstreamer
is supposed to run the conversion (DSP) job in "one shot" (can be handled
using one system call like blocking ioctl). The goal is just to offload the
CPU work to the DSP (co-processor). If there are no requirements for the
queuing, we can implement this ioctl in the compress ALSA API easily using the
data management through the dma-buf API. We can eventually define a new
direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
handling this new data scheme. The API may be extended later on real demand, of
course.

Otherwise all pieces are already in the current ALSA compress API
(capabilities, params, enumeration). The realtime controls may be created
using ALSA control API.

Jaroslav

--
Jaroslav Kysela <[email protected]>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.


2024-05-15 09:17:27

by Hans Verkuil

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

Hi Jaroslav,

On 5/13/24 13:56, Jaroslav Kysela wrote:
> On 09. 05. 24 13:13, Jaroslav Kysela wrote:
>> On 09. 05. 24 12:44, Shengjiu Wang wrote:
>>>>> mem2mem is just like the decoder in the compress pipeline. which is
>>>>> one of the components in the pipeline.
>>>>
>>>> I was thinking of loopback with endpoints using compress streams,
>>>> without physical endpoint, something like:
>>>>
>>>> compress playback (to feed data from userspace) -> DSP (processing) ->
>>>> compress capture (send data back to userspace)
>>>>
>>>> Unless I'm missing something, you should be able to process data as fast
>>>> as you can feed it and consume it in such case.
>>>>
>>>
>>> Actually in the beginning I tried this, but it did not work well.
>>> ALSA needs time control for playback and capture, playback and capture
>>> needs to synchronize. Usually the playback and capture pipeline is
>>> independent in ALSA design, but in this case, the playback and capture
>>> should synchronize, they are not independent.
>>
>> The core compress API has no strict timing constraints. You can eventually
>> have two half-duplex compress devices, if you like to have really independent
>> mechanism. If something is missing in API, you can extend this API (like to
>> inform the user space that it's a producer/consumer processing without any
>> relation to the real time). I like this idea.
>
> I was thinking more about this. If I am right, the mentioned use in gstreamer
> is supposed to run the conversion (DSP) job in "one shot" (can be handled
> using one system call like blocking ioctl). The goal is just to offload the
> CPU work to the DSP (co-processor). If there are no requirements for the
> queuing, we can implement this ioctl in the compress ALSA API easily using the
> data management through the dma-buf API. We can eventually define a new
> direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
> handle this new data scheme. The API may be extended later on real demand, of
> course.
>
> Otherwise all pieces are already in the current ALSA compress API
> (capabilities, params, enumeration). The realtime controls may be created
> using ALSA control API.

So does this mean that Shengjiu should attempt to use this ALSA approach first?

If there is a way to do this reasonably cleanly in the ALSA API, then that
obviously is much better from my perspective as a media maintainer.

My understanding was always that it can't be done (or at least not without
a major effort) in ALSA, and in that case V4L2 is a decent plan B, but based
on this I gather that it is possible in ALSA after all.

So can I shelve this patch series for now?

Regards,

Hans


2024-05-15 09:51:42

by Jaroslav Kysela

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 15. 05. 24 11:17, Hans Verkuil wrote:
> Hi Jaroslav,
>
> On 5/13/24 13:56, Jaroslav Kysela wrote:
>> On 09. 05. 24 13:13, Jaroslav Kysela wrote:
>>> On 09. 05. 24 12:44, Shengjiu Wang wrote:
>>>>>> mem2mem is just like the decoder in the compress pipeline. which is
>>>>>> one of the components in the pipeline.
>>>>>
>>>>> I was thinking of loopback with endpoints using compress streams,
>>>>> without physical endpoint, something like:
>>>>>
>>>>> compress playback (to feed data from userspace) -> DSP (processing) ->
>>>>> compress capture (send data back to userspace)
>>>>>
>>>>> Unless I'm missing something, you should be able to process data as fast
>>>>> as you can feed it and consume it in such case.
>>>>>
>>>>
>>>> Actually in the beginning I tried this, but it did not work well.
>>>> ALSA needs time control for playback and capture, playback and capture
>>>> needs to synchronize. Usually the playback and capture pipeline is
>>>> independent in ALSA design, but in this case, the playback and capture
>>>> should synchronize, they are not independent.
>>>
>>> The core compress API has no strict timing constraints. You can eventually
>>> have two half-duplex compress devices, if you like to have really independent
>>> mechanism. If something is missing in API, you can extend this API (like to
>>> inform the user space that it's a producer/consumer processing without any
>>> relation to the real time). I like this idea.
>>
>> I was thinking more about this. If I am right, the mentioned use in gstreamer
>> is supposed to run the conversion (DSP) job in "one shot" (can be handled
>> using one system call like blocking ioctl). The goal is just to offload the
>> CPU work to the DSP (co-processor). If there are no requirements for the
>> queuing, we can implement this ioctl in the compress ALSA API easily using the
>> data management through the dma-buf API. We can eventually define a new
>> direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
>> handle this new data scheme. The API may be extended later on real demand, of
>> course.
>>
>> Otherwise all pieces are already in the current ALSA compress API
>> (capabilities, params, enumeration). The realtime controls may be created
>> using ALSA control API.
>
> So does this mean that Shengjiu should attempt to use this ALSA approach first?

I've not seen any argument for forcing the use of the v4l2 mem2mem buffer
scheme for this data conversion. It looks like a simple job and the ALSA
APIs may be extended for this simple purpose.

Shengjiu, what are your requirements for gstreamer support? Would a new
blocking ioctl be enough for the initial support in the compress ALSA API?

Jaroslav

--
Jaroslav Kysela <[email protected]>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.


2024-05-15 10:19:57

by Takashi Iwai

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On Wed, 15 May 2024 11:50:52 +0200,
Jaroslav Kysela wrote:
>
> On 15. 05. 24 11:17, Hans Verkuil wrote:
> > Hi Jaroslav,
> >
> > On 5/13/24 13:56, Jaroslav Kysela wrote:
> >> On 09. 05. 24 13:13, Jaroslav Kysela wrote:
> >>> On 09. 05. 24 12:44, Shengjiu Wang wrote:
> >>>>>> mem2mem is just like the decoder in the compress pipeline. which is
> >>>>>> one of the components in the pipeline.
> >>>>>
> >>>>> I was thinking of loopback with endpoints using compress streams,
> >>>>> without physical endpoint, something like:
> >>>>>
> >>>>> compress playback (to feed data from userspace) -> DSP (processing) ->
> >>>>> compress capture (send data back to userspace)
> >>>>>
> >>>>> Unless I'm missing something, you should be able to process data as fast
> >>>>> as you can feed it and consume it in such case.
> >>>>>
> >>>>
> >>>> Actually in the beginning I tried this, but it did not work well.
> >>>> ALSA needs time control for playback and capture, playback and capture
> >>>> needs to synchronize. Usually the playback and capture pipeline is
> >>>> independent in ALSA design, but in this case, the playback and capture
> >>>> should synchronize, they are not independent.
> >>>
> >>> The core compress API has no strict timing constraints. You can eventually
> >>> have two half-duplex compress devices, if you like to have really independent
> >>> mechanism. If something is missing in API, you can extend this API (like to
> >>> inform the user space that it's a producer/consumer processing without any
> >>> relation to the real time). I like this idea.
> >>
> >> I was thinking more about this. If I am right, the mentioned use in gstreamer
> >> is supposed to run the conversion (DSP) job in "one shot" (can be handled
> >> using one system call like blocking ioctl). The goal is just to offload the
> >> CPU work to the DSP (co-processor). If there are no requirements for the
> >> queuing, we can implement this ioctl in the compress ALSA API easily using the
> >> data management through the dma-buf API. We can eventually define a new
> >> direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
> >> handle this new data scheme. The API may be extended later on real demand, of
> >> course.
> >>
> >> Otherwise all pieces are already in the current ALSA compress API
> >> (capabilities, params, enumeration). The realtime controls may be created
> >> using ALSA control API.
> >
> > So does this mean that Shengjiu should attempt to use this ALSA approach first?
>
> I've not seen any argument to use v4l2 mem2mem buffer scheme for this
> data conversion forcefully. It looks like a simple job and ALSA APIs
> may be extended for this simple purpose.
>
> Shengjiu, what are your requirements for gstreamer support? Would be a
> new blocking ioctl enough for the initial support in the compress ALSA
> API?

If it works with the compress API, it'd be great, yeah.
So, your idea is to open compress-offload devices for read and write,
and then let them convert, a la batch jobs, without timing control?

For full-duplex usages, we might need some more extensions, so that
both read and write parameters can be synchronized. (So far the
compress stream is unidirectional, and the runtime buffer is for a
single stream.)

And the buffer management is based on fixed-size fragments. I
hope this doesn't matter much for the intended operation?


thanks,

Takashi

2024-05-15 10:48:54

by Jaroslav Kysela

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 15. 05. 24 12:19, Takashi Iwai wrote:
> On Wed, 15 May 2024 11:50:52 +0200,
> Jaroslav Kysela wrote:
>>
>> On 15. 05. 24 11:17, Hans Verkuil wrote:
>>> Hi Jaroslav,
>>>
>>> On 5/13/24 13:56, Jaroslav Kysela wrote:
>>>> On 09. 05. 24 13:13, Jaroslav Kysela wrote:
>>>>> On 09. 05. 24 12:44, Shengjiu Wang wrote:
>>>>>>>> mem2mem is just like the decoder in the compress pipeline. which is
>>>>>>>> one of the components in the pipeline.
>>>>>>>
>>>>>>> I was thinking of loopback with endpoints using compress streams,
>>>>>>> without physical endpoint, something like:
>>>>>>>
>>>>>>> compress playback (to feed data from userspace) -> DSP (processing) ->
>>>>>>> compress capture (send data back to userspace)
>>>>>>>
>>>>>>> Unless I'm missing something, you should be able to process data as fast
>>>>>>> as you can feed it and consume it in such case.
>>>>>>>
>>>>>>
>>>>>> Actually in the beginning I tried this, but it did not work well.
>>>>>> ALSA needs time control for playback and capture, playback and capture
>>>>>> needs to synchronize. Usually the playback and capture pipeline is
>>>>>> independent in ALSA design, but in this case, the playback and capture
>>>>>> should synchronize, they are not independent.
>>>>>
>>>>> The core compress API has no strict timing constraints. You can eventually
>>>>> have two half-duplex compress devices, if you like to have really independent
>>>>> mechanism. If something is missing in API, you can extend this API (like to
>>>>> inform the user space that it's a producer/consumer processing without any
>>>>> relation to the real time). I like this idea.
>>>>
>>>> I was thinking more about this. If I am right, the mentioned use in gstreamer
>>>> is supposed to run the conversion (DSP) job in "one shot" (can be handled
>>>> using one system call like blocking ioctl). The goal is just to offload the
>>>> CPU work to the DSP (co-processor). If there are no requirements for the
>>>> queuing, we can implement this ioctl in the compress ALSA API easily using the
>>>> data management through the dma-buf API. We can eventually define a new
>>>> direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
>>>> handle this new data scheme. The API may be extended later on real demand, of
>>>> course.
>>>>
>>>> Otherwise all pieces are already in the current ALSA compress API
>>>> (capabilities, params, enumeration). The realtime controls may be created
>>>> using ALSA control API.
>>>
>>> So does this mean that Shengjiu should attempt to use this ALSA approach first?
>>
>> I've not seen any argument to use v4l2 mem2mem buffer scheme for this
>> data conversion forcefully. It looks like a simple job and ALSA APIs
>> may be extended for this simple purpose.
>>
>> Shengjiu, what are your requirements for gstreamer support? Would be a
>> new blocking ioctl enough for the initial support in the compress ALSA
>> API?
>
> If it works with compress API, it'd be great, yeah.
> So, your idea is to open compress-offload devices for read and write,
> then and let them convert a la batch jobs without timing control?
>
> For full-duplex usages, we might need some more extensions, so that
> both read and write parameters can be synchronized. (So far the
> compress stream is a unidirectional, and the runtime buffer for a
> single stream.)
>
> And the buffer management is based on the fixed size fragments. I
> hope this doesn't matter much for the intended operation?

It's a question whether standard I/O is really required for this case. My
quick idea was to just implement a new "direction" for this job, supporting
only one ioctl for the data processing, which will execute the job in "one
shot" at the moment. The I/O may be handled through the dma-buf API (which
seems to be the standard nowadays for this purpose and allows future chaining).

So something like:

struct dsp_job {
	int source_fd;	/* dma-buf FD with source data - for dma_buf_get() */
	int target_fd;	/* dma-buf FD for target data - for dma_buf_get() */
	/* ... maybe some extra data size members here ... */
	/* ... maybe some special parameters here ... */
};

#define SNDRV_COMPRESS_DSPJOB _IOWR('C', 0x60, struct dsp_job)

This ioctl will be blocking (thus synced). My question is whether it's
feasible for gstreamer or not. For this particular case, if the rate
conversion is implemented in software, it will block the gstreamer data
processing, too.

Jaroslav

--
Jaroslav Kysela <[email protected]>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.


2024-05-15 13:35:07

by Shengjiu Wang

Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On Wed, May 15, 2024 at 6:46 PM Jaroslav Kysela <[email protected]> wrote:
>
> On 15. 05. 24 12:19, Takashi Iwai wrote:
> > On Wed, 15 May 2024 11:50:52 +0200,
> > Jaroslav Kysela wrote:
> >>
> >> On 15. 05. 24 11:17, Hans Verkuil wrote:
> >>> Hi Jaroslav,
> >>>
> >>> On 5/13/24 13:56, Jaroslav Kysela wrote:
> >>>> On 09. 05. 24 13:13, Jaroslav Kysela wrote:
> >>>>> On 09. 05. 24 12:44, Shengjiu Wang wrote:
> >>>>>>>> mem2mem is just like the decoder in the compress pipeline. which is
> >>>>>>>> one of the components in the pipeline.
> >>>>>>>
> >>>>>>> I was thinking of loopback with endpoints using compress streams,
> >>>>>>> without physical endpoint, something like:
> >>>>>>>
> >>>>>>> compress playback (to feed data from userspace) -> DSP (processing) ->
> >>>>>>> compress capture (send data back to userspace)
> >>>>>>>
> >>>>>>> Unless I'm missing something, you should be able to process data as fast
> >>>>>>> as you can feed it and consume it in such case.
> >>>>>>>
> >>>>>>
> >>>>>> Actually in the beginning I tried this, but it did not work well.
> >>>>>> ALSA needs time control for playback and capture, playback and capture
> >>>>>> needs to synchronize. Usually the playback and capture pipeline is
> >>>>>> independent in ALSA design, but in this case, the playback and capture
> >>>>>> should synchronize, they are not independent.
> >>>>>
> >>>>> The core compress API has no strict timing constraints. You can eventually
> >>>>> have two half-duplex compress devices, if you like to have really independent
> >>>>> mechanism. If something is missing in API, you can extend this API (like to
> >>>>> inform the user space that it's a producer/consumer processing without any
> >>>>> relation to the real time). I like this idea.
> >>>>
> >>>> I was thinking more about this. If I am right, the mentioned use in gstreamer
> >>>> is supposed to run the conversion (DSP) job in "one shot" (can be handled
> >>>> using one system call like blocking ioctl). The goal is just to offload the
> >>>> CPU work to the DSP (co-processor). If there are no requirements for the
> >>>> queuing, we can implement this ioctl in the compress ALSA API easily using the
> >>>> data management through the dma-buf API. We can eventually define a new
> >>>> direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
> >>>> handling this new data scheme. The API may be extended later on real demand, of
> >>>> course.
> >>>>
> >>>> Otherwise all pieces are already in the current ALSA compress API
> >>>> (capabilities, params, enumeration). The realtime controls may be created
> >>>> using ALSA control API.
> >>>
> >>> So does this mean that Shengjiu should attempt to use this ALSA approach first?
> >>
> >> I've not seen any argument to use v4l2 mem2mem buffer scheme for this
> >> data conversion forcefully. It looks like a simple job and ALSA APIs
> >> may be extended for this simple purpose.
> >>
> >> Shengjiu, what are your requirements for gstreamer support? Would be a
> >> new blocking ioctl enough for the initial support in the compress ALSA
> >> API?
> >
> > If it works with compress API, it'd be great, yeah.
> > So, your idea is to open compress-offload devices for read and write,
> > then let them convert a la batch jobs without timing control?
> >
> > For full-duplex usages, we might need some more extensions, so that
> > both read and write parameters can be synchronized. (So far the
> > compress stream is unidirectional, and the runtime buffer is for a
> > single stream.)
> >
> > And the buffer management is based on the fixed size fragments. I
> > hope this doesn't matter much for the intended operation?
>
> It's a question, if the standard I/O is really required for this case. My
> quick idea was to just implement a new "direction" for this job supporting
> only one ioctl for the data processing which will execute the job in "one
> shot" at the moment. The I/O may be handled through dma-buf API (which seems
> to be standard nowadays for this purpose and allows future chaining).
>
> So something like:
>
> struct dsp_job {
> int source_fd; /* dma-buf FD with source data - for dma_buf_get() */
> int target_fd; /* dma-buf FD for target data - for dma_buf_get() */
> ... maybe some extra data size members here ...
> ... maybe some special parameters here ...
> };
>
> #define SNDRV_COMPRESS_DSPJOB _IOWR('C', 0x60, struct dsp_job)
>
> This ioctl will be blocking (thus synced). My question is, if it's feasible
> for gstreamer or not. For this particular case, if the rate conversion is
> implemented in software, it will block the gstreamer data processing, too.
>

Thanks.

I have several questions:
1. Compress API always binds to a sound card. Can we avoid that?
For ASRC, it is just one component.

2. Compress API doesn't seem to support mmap(). Is this a problem
for sending and getting data to/from the driver?

3. How does the user get output data from ASRC after each conversion?
It should happen every period.

best regards
Shengjiu Wang.

2024-05-15 14:05:00

by Pierre-Louis Bossart

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework



On 5/9/24 06:13, Jaroslav Kysela wrote:
> On 09. 05. 24 12:44, Shengjiu Wang wrote:
>>>> mem2mem is just like the decoder in the compress pipeline, which is
>>>> one of the components in the pipeline.
>>>
>>> I was thinking of loopback with endpoints using compress streams,
>>> without physical endpoint, something like:
>>>
>>> compress playback (to feed data from userspace) -> DSP (processing) ->
>>> compress capture (send data back to userspace)
>>>
>>> Unless I'm missing something, you should be able to process data as fast
>>> as you can feed it and consume it in such case.
>>>
>>
>> Actually in the beginning I tried this,  but it did not work well.
>> ALSA needs time control for playback and capture, playback and capture
>> needs to synchronize.  Usually the playback and capture pipeline is
>> independent in ALSA design,  but in this case, the playback and capture
>> should synchronize, they are not independent.
>
> The core compress API has no strict timing constraints. You can
> eventually have two half-duplex compress devices, if you like to have
> really independent mechanism. If something is missing in API, you can
> extend this API (like to inform the user space that it's a
> producer/consumer processing without any relation to the real time). I
> like this idea.

The compress API was never intended to be used this way. It was meant to
send compressed data to a DSP for rendering, and keep the host processor
in a low-power state while the DSP local buffer was drained. There was
no intent to do a loop back to the host, because that keeps the host in
a high-power state and probably negates the power savings due to a DSP.

The other problem with the loopback is that the compress stuff is
usually a "Front-End" in ASoC/DPCM parlance, and we don't have a good
way to do a loopback between Front-Ends. The entire framework is based
on FEs being connected to BEs.

One problem that I can see for ASRC is that it's not clear when the data
will be completely processed on the "capture" stream when you stop the
"playback" stream. There's a non-zero risk of having a truncated output
or waiting for data that will never be generated.

In other words, it might be possible to reuse/extend the compress API
for a 'coprocessor' approach without any rendering to traditional
interfaces, but it's uncharted territory.

2024-05-15 20:34:04

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

Hi,

GStreamer hat on ...

Le mercredi 15 mai 2024 à 12:46 +0200, Jaroslav Kysela a écrit :
> On 15. 05. 24 12:19, Takashi Iwai wrote:
> > On Wed, 15 May 2024 11:50:52 +0200,
> > Jaroslav Kysela wrote:
> > >
> > > On 15. 05. 24 11:17, Hans Verkuil wrote:
> > > > Hi Jaroslav,
> > > >
> > > > On 5/13/24 13:56, Jaroslav Kysela wrote:
> > > > > On 09. 05. 24 13:13, Jaroslav Kysela wrote:
> > > > > > On 09. 05. 24 12:44, Shengjiu Wang wrote:
> > > > > > > > > mem2mem is just like the decoder in the compress pipeline which is
> > > > > > > > > one of the components in the pipeline.
> > > > > > > >
> > > > > > > > I was thinking of loopback with endpoints using compress streams,
> > > > > > > > without physical endpoint, something like:
> > > > > > > >
> > > > > > > > compress playback (to feed data from userspace) -> DSP (processing) ->
> > > > > > > > compress capture (send data back to userspace)
> > > > > > > >
> > > > > > > > Unless I'm missing something, you should be able to process data as fast
> > > > > > > > as you can feed it and consume it in such case.
> > > > > > > >
> > > > > > >
> > > > > > > Actually in the beginning I tried this, but it did not work well.
> > > > > > > ALSA needs time control for playback and capture, playback and capture
> > > > > > > needs to synchronize. Usually the playback and capture pipeline is
> > > > > > > independent in ALSA design, but in this case, the playback and capture
> > > > > > > should synchronize, they are not independent.
> > > > > >
> > > > > > The core compress API has no strict timing constraints. You can eventually
> > > > > > have two half-duplex compress devices, if you like to have really independent
> > > > > > mechanism. If something is missing in API, you can extend this API (like to
> > > > > > inform the user space that it's a producer/consumer processing without any
> > > > > > relation to the real time). I like this idea.
> > > > >
> > > > > I was thinking more about this. If I am right, the mentioned use in gstreamer
> > > > > is supposed to run the conversion (DSP) job in "one shot" (can be handled
> > > > > using one system call like blocking ioctl). The goal is just to offload the
> > > > > CPU work to the DSP (co-processor). If there are no requirements for the
> > > > > queuing, we can implement this ioctl in the compress ALSA API easily using the
> > > > > data management through the dma-buf API. We can eventually define a new
> > > > > direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
> > > > > > handling this new data scheme. The API may be extended later on real demand, of
> > > > > course.
> > > > >
> > > > > Otherwise all pieces are already in the current ALSA compress API
> > > > > (capabilities, params, enumeration). The realtime controls may be created
> > > > > using ALSA control API.
> > > >
> > > > So does this mean that Shengjiu should attempt to use this ALSA approach first?
> > >
> > > I've not seen any argument to use v4l2 mem2mem buffer scheme for this
> > > data conversion forcefully. It looks like a simple job and ALSA APIs
> > > may be extended for this simple purpose.
> > >
> > > Shengjiu, what are your requirements for gstreamer support? Would be a
> > > new blocking ioctl enough for the initial support in the compress ALSA
> > > API?
> >
> > If it works with compress API, it'd be great, yeah.
> > So, your idea is to open compress-offload devices for read and write,
> > then let them convert a la batch jobs without timing control?
> >
> > For full-duplex usages, we might need some more extensions, so that
> > both read and write parameters can be synchronized. (So far the
> > compress stream is unidirectional, and the runtime buffer is for a
> > single stream.)
> >
> > And the buffer management is based on the fixed size fragments. I
> > hope this doesn't matter much for the intended operation?
>
> It's a question, if the standard I/O is really required for this case. My
> quick idea was to just implement a new "direction" for this job supporting
> only one ioctl for the data processing which will execute the job in "one
> shot" at the moment. The I/O may be handled through dma-buf API (which seems
> to be standard nowadays for this purpose and allows future chaining).
>
> So something like:
>
> struct dsp_job {
> int source_fd; /* dma-buf FD with source data - for dma_buf_get() */
> int target_fd; /* dma-buf FD for target data - for dma_buf_get() */
> ... maybe some extra data size members here ...
> ... maybe some special parameters here ...
> };
>
> #define SNDRV_COMPRESS_DSPJOB _IOWR('C', 0x60, struct dsp_job)
>
> This ioctl will be blocking (thus synced). My question is, if it's feasible
> for gstreamer or not. For this particular case, if the rate conversion is
> implemented in software, it will block the gstreamer data processing, too.

Yes, GStreamer threading uses a push-back model, so blocking for the duration
of the processing is fine. Note that the extra simplicity will suffer from ioctl()
latency.

In GFX, they solve this issue with fences. That allows setting up the next
operation in the chain before the data has been produced.

In V4L2, we solve this with queues, which allow preparing the next job while
the current job is being processed. If you look at the v4l2convert code in
GStreamer (for simple m2m), it currently makes no use of the queues and simply
processes the frames synchronously. There are two options: either it does not
matter that much, or no one is using it :-D Video decoders and encoders
(stateful) do run input/output from different threads to benefit from the
queues.

regards,
Nicolas

>
> Jaroslav
>


2024-05-16 14:51:49

by Jaroslav Kysela

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 15. 05. 24 22:33, Nicolas Dufresne wrote:
> Hi,
>
> GStreamer hat on ...
>
> Le mercredi 15 mai 2024 à 12:46 +0200, Jaroslav Kysela a écrit :
>> On 15. 05. 24 12:19, Takashi Iwai wrote:
>>> On Wed, 15 May 2024 11:50:52 +0200,
>>> Jaroslav Kysela wrote:
>>>>
>>>> On 15. 05. 24 11:17, Hans Verkuil wrote:
>>>>> Hi Jaroslav,
>>>>>
>>>>> On 5/13/24 13:56, Jaroslav Kysela wrote:
>>>>>> On 09. 05. 24 13:13, Jaroslav Kysela wrote:
>>>>>>> On 09. 05. 24 12:44, Shengjiu Wang wrote:
>>>>>>>>>> mem2mem is just like the decoder in the compress pipeline, which is
>>>>>>>>>> one of the components in the pipeline.
>>>>>>>>>
>>>>>>>>> I was thinking of loopback with endpoints using compress streams,
>>>>>>>>> without physical endpoint, something like:
>>>>>>>>>
>>>>>>>>> compress playback (to feed data from userspace) -> DSP (processing) ->
>>>>>>>>> compress capture (send data back to userspace)
>>>>>>>>>
>>>>>>>>> Unless I'm missing something, you should be able to process data as fast
>>>>>>>>> as you can feed it and consume it in such case.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Actually in the beginning I tried this, but it did not work well.
>>>>>>>> ALSA needs time control for playback and capture, playback and capture
>>>>>>>> needs to synchronize. Usually the playback and capture pipeline is
>>>>>>>> independent in ALSA design, but in this case, the playback and capture
>>>>>>>> should synchronize, they are not independent.
>>>>>>>
>>>>>>> The core compress API has no strict timing constraints. You can eventually
>>>>>>> have two half-duplex compress devices, if you like to have really independent
>>>>>>> mechanism. If something is missing in API, you can extend this API (like to
>>>>>>> inform the user space that it's a producer/consumer processing without any
>>>>>>> relation to the real time). I like this idea.
>>>>>>
>>>>>> I was thinking more about this. If I am right, the mentioned use in gstreamer
>>>>>> is supposed to run the conversion (DSP) job in "one shot" (can be handled
>>>>>> using one system call like blocking ioctl). The goal is just to offload the
>>>>>> CPU work to the DSP (co-processor). If there are no requirements for the
>>>>>> queuing, we can implement this ioctl in the compress ALSA API easily using the
>>>>>> data management through the dma-buf API. We can eventually define a new
>>>>>> direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
>>>>>> handling this new data scheme. The API may be extended later on real demand, of
>>>>>> course.
>>>>>>
>>>>>> Otherwise all pieces are already in the current ALSA compress API
>>>>>> (capabilities, params, enumeration). The realtime controls may be created
>>>>>> using ALSA control API.
>>>>>
>>>>> So does this mean that Shengjiu should attempt to use this ALSA approach first?
>>>>
>>>> I've not seen any argument to use v4l2 mem2mem buffer scheme for this
>>>> data conversion forcefully. It looks like a simple job and ALSA APIs
>>>> may be extended for this simple purpose.
>>>>
>>>> Shengjiu, what are your requirements for gstreamer support? Would be a
>>>> new blocking ioctl enough for the initial support in the compress ALSA
>>>> API?
>>>
>>> If it works with compress API, it'd be great, yeah.
>>> So, your idea is to open compress-offload devices for read and write,
>>> then let them convert a la batch jobs without timing control?
>>>
>>> For full-duplex usages, we might need some more extensions, so that
>>> both read and write parameters can be synchronized. (So far the
>>> compress stream is unidirectional, and the runtime buffer is for a
>>> single stream.)
>>>
>>> And the buffer management is based on the fixed size fragments. I
>>> hope this doesn't matter much for the intended operation?
>>
>> It's a question, if the standard I/O is really required for this case. My
>> quick idea was to just implement a new "direction" for this job supporting
>> only one ioctl for the data processing which will execute the job in "one
>> shot" at the moment. The I/O may be handled through dma-buf API (which seems
>> to be standard nowadays for this purpose and allows future chaining).
>>
>> So something like:
>>
>> struct dsp_job {
>> int source_fd; /* dma-buf FD with source data - for dma_buf_get() */
>> int target_fd; /* dma-buf FD for target data - for dma_buf_get() */
>> ... maybe some extra data size members here ...
>> ... maybe some special parameters here ...
>> };
>>
>> #define SNDRV_COMPRESS_DSPJOB _IOWR('C', 0x60, struct dsp_job)
>>
>> This ioctl will be blocking (thus synced). My question is, if it's feasible
>> for gstreamer or not. For this particular case, if the rate conversion is
>> implemented in software, it will block the gstreamer data processing, too.
>
> Yes, GStreamer threading uses a push-back model, so blocking for the time of
> the processing is fine. Note that the extra simplicity will suffer from ioctl()
> latency.
>
> In GFX, they solve this issue with fences. That allows setting up the next
> operation in the chain before the data has been produced.

The fences look really nice and seem more modern. It should be possible with
the dma-buf/sync_file.c interface to handle multiple jobs simultaneously and
share the state between user space and the kernel driver.

In this case, I think that two non-blocking ioctls should be enough: one to
add a new job with source/target dma buffers guarded by one fence, and one to
abort (flush) all active jobs.

I'll try to propose an API extension for the ALSA's compress API in the
linux-sound mailing list soon.

Jaroslav

--
Jaroslav Kysela <[email protected]>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.


2024-05-16 14:59:33

by Jaroslav Kysela

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 15. 05. 24 15:34, Shengjiu Wang wrote:
> On Wed, May 15, 2024 at 6:46 PM Jaroslav Kysela <[email protected]> wrote:
>>
>> On 15. 05. 24 12:19, Takashi Iwai wrote:
>>> On Wed, 15 May 2024 11:50:52 +0200,
>>> Jaroslav Kysela wrote:
>>>>
>>>> On 15. 05. 24 11:17, Hans Verkuil wrote:
>>>>> Hi Jaroslav,
>>>>>
>>>>> On 5/13/24 13:56, Jaroslav Kysela wrote:
>>>>>> On 09. 05. 24 13:13, Jaroslav Kysela wrote:
>>>>>>> On 09. 05. 24 12:44, Shengjiu Wang wrote:
>>>>>>>>>> mem2mem is just like the decoder in the compress pipeline, which is
>>>>>>>>>> one of the components in the pipeline.
>>>>>>>>>
>>>>>>>>> I was thinking of loopback with endpoints using compress streams,
>>>>>>>>> without physical endpoint, something like:
>>>>>>>>>
>>>>>>>>> compress playback (to feed data from userspace) -> DSP (processing) ->
>>>>>>>>> compress capture (send data back to userspace)
>>>>>>>>>
>>>>>>>>> Unless I'm missing something, you should be able to process data as fast
>>>>>>>>> as you can feed it and consume it in such case.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Actually in the beginning I tried this, but it did not work well.
>>>>>>>> ALSA needs time control for playback and capture, playback and capture
>>>>>>>> needs to synchronize. Usually the playback and capture pipeline is
>>>>>>>> independent in ALSA design, but in this case, the playback and capture
>>>>>>>> should synchronize, they are not independent.
>>>>>>>
>>>>>>> The core compress API has no strict timing constraints. You can eventually
>>>>>>> have two half-duplex compress devices, if you like to have really independent
>>>>>>> mechanism. If something is missing in API, you can extend this API (like to
>>>>>>> inform the user space that it's a producer/consumer processing without any
>>>>>>> relation to the real time). I like this idea.
>>>>>>
>>>>>> I was thinking more about this. If I am right, the mentioned use in gstreamer
>>>>>> is supposed to run the conversion (DSP) job in "one shot" (can be handled
>>>>>> using one system call like blocking ioctl). The goal is just to offload the
>>>>>> CPU work to the DSP (co-processor). If there are no requirements for the
>>>>>> queuing, we can implement this ioctl in the compress ALSA API easily using the
>>>>>> data management through the dma-buf API. We can eventually define a new
>>>>>> direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
>>>>>> handling this new data scheme. The API may be extended later on real demand, of
>>>>>> course.
>>>>>>
>>>>>> Otherwise all pieces are already in the current ALSA compress API
>>>>>> (capabilities, params, enumeration). The realtime controls may be created
>>>>>> using ALSA control API.
>>>>>
>>>>> So does this mean that Shengjiu should attempt to use this ALSA approach first?
>>>>
>>>> I've not seen any argument to use v4l2 mem2mem buffer scheme for this
>>>> data conversion forcefully. It looks like a simple job and ALSA APIs
>>>> may be extended for this simple purpose.
>>>>
>>>> Shengjiu, what are your requirements for gstreamer support? Would be a
>>>> new blocking ioctl enough for the initial support in the compress ALSA
>>>> API?
>>>
>>> If it works with compress API, it'd be great, yeah.
>>> So, your idea is to open compress-offload devices for read and write,
>>> then let them convert a la batch jobs without timing control?
>>>
>>> For full-duplex usages, we might need some more extensions, so that
>>> both read and write parameters can be synchronized. (So far the
>>> compress stream is unidirectional, and the runtime buffer is for a
>>> single stream.)
>>>
>>> And the buffer management is based on the fixed size fragments. I
>>> hope this doesn't matter much for the intended operation?
>>
>> It's a question, if the standard I/O is really required for this case. My
>> quick idea was to just implement a new "direction" for this job supporting
>> only one ioctl for the data processing which will execute the job in "one
>> shot" at the moment. The I/O may be handled through dma-buf API (which seems
>> to be standard nowadays for this purpose and allows future chaining).
>>
>> So something like:
>>
>> struct dsp_job {
>> int source_fd; /* dma-buf FD with source data - for dma_buf_get() */
>> int target_fd; /* dma-buf FD for target data - for dma_buf_get() */
>> ... maybe some extra data size members here ...
>> ... maybe some special parameters here ...
>> };
>>
>> #define SNDRV_COMPRESS_DSPJOB _IOWR('C', 0x60, struct dsp_job)
>>
>> This ioctl will be blocking (thus synced). My question is, if it's feasible
>> for gstreamer or not. For this particular case, if the rate conversion is
>> implemented in software, it will block the gstreamer data processing, too.
>>
>
> Thanks.
>
> I have several questions:
> 1. Compress API always binds to a sound card. Can we avoid that?
> For ASRC, it is just one component.

Is this a real issue? Usually, I would expect sound hardware (a card) to be
present when ASRC is available, or not? Alternatively, a separate sound card
with one compress device may be created. For enumeration, user space may just
iterate through all sound cards / compress devices to find the ASRC in the system.

The devices/interfaces in a sound card are independent. USB MIDI converters,
for example, offer only one serial MIDI interface, too.

> 2. Compress API doesn't seem to support mmap(). Is this a problem
> for sending and getting data to/from the driver?

I proposed to use dma-buf for I/O (separate source and target buffer).

> 3. How does the user get output data from ASRC after each conversion?
> It should happen every period.

target dma-buf

Jaroslav

--
Jaroslav Kysela <[email protected]>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.


2024-05-27 07:25:40

by Jaroslav Kysela

[permalink] [raw]
Subject: Re: [PATCH v15 00/16] Add audio support in v4l2 framework

On 16. 05. 24 16:50, Jaroslav Kysela wrote:
> On 15. 05. 24 22:33, Nicolas Dufresne wrote:

>> In GFX, they solve this issue with fences. That allows setting up the next
>> operation in the chain before the data has been produced.
>
> The fences look really nice and seem more modern. It should be possible with
> dma-buf/sync_file.c interface to handle multiple jobs simultaneously and share
> the state between user space and kernel driver.
>
> In this case, I think that two non-blocking ioctls should be enough - add a
> new job with source/target dma buffers guarded by one fence and abort (flush)
> all active jobs.
>
> I'll try to propose an API extension for the ALSA's compress API in the
> linux-sound mailing list soon.

I found using sync_file during the implementation to be overkill for resource
management, so I proposed a simple queue with the standard poll mechanism.

https://lore.kernel.org/linux-sound/[email protected]/

Jaroslav

--
Jaroslav Kysela <[email protected]>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.