2004-06-26 20:09:40

by Pete Zaitcev

[permalink] [raw]
Subject: drivers/block/ub.c

Hi, guys,

I have drafted up an implementation of a USB storage driver as I wish
it done (called "ub"). The main goal of the project is to produce a driver
which never oopses, and above all, never locks up the machine. To this
point I did all my debugging while running X11 and yapping on IRC. If this
goal requires to sacrifice performance, so be it. This is how ub differs
from the usb-storage.

The current usb-storage works quite well on servers where netdump can
be brought to bear, but on desktop its debuggability leaves some room
for improvement. In all other respects, it is superior to ub. Since
characteristics of usb-storage and ub are different I expect them to
coexist potentially indefinitely (if ub finds any use at all).

Please refer to the attached patch. It is quite raw, for instance the
disconnect is not processed at all, although I do have a plan for it.
This posting is largely a "release early release often" excercise, as
Papa ESR taught us. But you can see the design outline clearly now,
I hope, and I'm interested in feedback on that.

Best wishes,
-- Pete

diff -urpN -X dontdiff linux-2.6.7/drivers/block/Kconfig linux-2.6.7-ub/drivers/block/Kconfig
--- linux-2.6.7/drivers/block/Kconfig 2004-06-16 16:53:48.000000000 -0700
+++ linux-2.6.7-ub/drivers/block/Kconfig 2004-06-16 17:47:14.000000000 -0700
@@ -301,6 +301,14 @@ config BLK_DEV_CARMEL

Use devices /dev/carmel/$N and /dev/carmel/$Np$M.

+config BLK_DEV_UB
+ tristate "Low Performance USB Block driver"
+ depends on USB
+ help
+ This driver supports USB attached storage devices.
+
+ If unsure, say N.
+
config BLK_DEV_RAM
tristate "RAM disk support"
---help---
diff -urpN -X dontdiff linux-2.6.7/drivers/block/Makefile linux-2.6.7-ub/drivers/block/Makefile
--- linux-2.6.7/drivers/block/Makefile 2004-05-10 17:39:54.000000000 -0700
+++ linux-2.6.7-ub/drivers/block/Makefile 2004-06-16 17:47:14.000000000 -0700
@@ -42,4 +42,5 @@ obj-$(CONFIG_BLK_DEV_CRYPTOLOOP) += cryp

obj-$(CONFIG_VIODASD) += viodasd.o
obj-$(CONFIG_BLK_DEV_CARMEL) += carmel.o
+obj-$(CONFIG_BLK_DEV_UB) += ub.o

diff -urpN -X dontdiff linux-2.6.7/drivers/block/ub.c linux-2.6.7-ub/drivers/block/ub.c
--- linux-2.6.7/drivers/block/ub.c 1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6.7-ub/drivers/block/ub.c 2004-06-26 12:39:20.258021109 -0700
@@ -0,0 +1,1511 @@
+/*
+ * The low performance USB storage driver (ub).
+ *
+ * Copyright (c) 1999, 2000 Matthew Dharm ([email protected])
+ * Copyright (C) 2004 Pete Zaitcev
+ *
+ * This work is a part of Linux kernel, is derived from it,
+ * and is not licensed separately. See file COPYING for details.
+ *
+ * TODO
+ * -- disconnect, poison tests
+ * -- use serial numbers to hook onto same hosts (same minor) after disconnect
+ * -- verify protocol (bulk) from USB descriptors
+ * -- do inquiry and verify we got a disk and not a tape (for LUN mismatch)
+ * -- normal queue instead of sc->busy and friends
+ * -- normal pool of commands
+ * -- timers for all URBs
+ * -- highmem and sg
+ * -- verify that requests get merged (count 1 sector, 8 sector requests)
+ */
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/usb.h>
+#include <linux/blkdev.h>
+#include <linux/devfs_fs_kernel.h>
+
+#define DRV_NAME "ub"
+
+/*
+ * Definitions which have to be scattered once we understand the layout better.
+ */
+
+/* Transport (despite PR in the name) */
+#define US_PR_BULK 0x50 /* bulk only */
+
+/* Protocol */
+#define US_SC_SCSI 0x06 /* Transparent */
+
+/*
+ * SCSI opcodes: Common, Block, Streaming.
+ */
+#define TEST_UNIT_READY 0x00
+#define REQUEST_SENSE 0x03
+#define INQUIRY 0x12
+#define READ_10 0x28 /* Block */
+#define WRITE_10 0x2a /* Block */
+
+#define SENSE_SIZE 18
+
+/*
+ * Sense keys
+ */
+#define NO_SENSE 0x00
+#define RECOVERED_ERROR 0x01
+#define NOT_READY 0x02
+#define MEDIUM_ERROR 0x03
+#define HARDWARE_ERROR 0x04
+#define ILLEGAL_REQUEST 0x05
+#define UNIT_ATTENTION 0x06
+#define DATA_PROTECT 0x07
+#define BLANK_CHECK 0x08
+#define COPY_ABORTED 0x0a
+#define ABORTED_COMMAND 0x0b
+#define VOLUME_OVERFLOW 0x0d
+#define MISCOMPARE 0x0e
+
+/*
+ */
+#define UB_MINORS_PER_MAJOR 8
+
+#define MAX_CDB_SIZE 16 /* Corresponds to Bulk */
+
+/*
+ */
+
+/* command block wrapper */
+struct bulk_cb_wrap {
+ u32 Signature; /* contains 'USBC' */
+ u32 Tag; /* unique per command id */
+ u32 DataTransferLength; /* size of data */
+ u8 Flags; /* direction in bit 0 */
+ u8 Lun; /* LUN normally 0 */
+ u8 Length; /* of of the CDB */
+ u8 CDB[MAX_CDB_SIZE]; /* max command */
+};
+
+#define US_BULK_CB_WRAP_LEN 31
+#define US_BULK_CB_SIGN 0x43425355 /*spells out USBC */
+#define US_BULK_FLAG_IN 1
+#define US_BULK_FLAG_OUT 0
+
+/* command status wrapper */
+struct bulk_cs_wrap {
+ u32 Signature; /* should = 'USBS' */
+ u32 Tag; /* same as original command */
+ u32 Residue; /* amount not transferred */
+ u8 Status; /* see below */
+};
+
+#define US_BULK_CS_WRAP_LEN 13
+#define US_BULK_CS_SIGN 0x53425355 /* spells out 'USBS' */
+/* This is for Olympus Camedia digital cameras */
+#define US_BULK_CS_OLYMPUS_SIGN 0x55425355 /* spells out 'USBU' */
+#define US_BULK_STAT_OK 0
+#define US_BULK_STAT_FAIL 1
+#define US_BULK_STAT_PHASE 2
+
+/* bulk-only class specific requests */
+#define US_BULK_RESET_REQUEST 0xff
+#define US_BULK_GET_MAX_LUN 0xfe
+
+/*
+ */
+struct ub_dev;
+
+#define UB_MAX_REQ_SG 1
+
+/*
+ * An instance of a SCSI command in transit.
+ */
+#define UB_DIR_NONE 0
+#define UB_DIR_READ 1
+#define UB_DIR_ILLEGAL2 2
+#define UB_DIR_WRITE 3
+
+enum ub_scsi_cmd_state {
+ UB_CMDST_INIT, /* Initial state */
+ UB_CMDST_CMD, /* Command submitted */
+ UB_CMDST_DATA, /* Data phase */
+ UB_CMDST_CLR2STS, /* Clearing before requesting status */
+ UB_CMDST_STAT, /* Status phase */
+ UB_CMDST_CLEAR, /* Clearing a stall (halt, actually) */
+ UB_CMDST_SENSE, /* Sending Request Sense */
+ UB_CMDST_DONE /* Final state */
+};
+
+struct ub_scsi_cmd {
+ unsigned char cdb[MAX_CDB_SIZE];
+ unsigned char cdb_len;
+
+ unsigned char dir; /* 0 - none, 1 - read, 3 - write. */
+ enum ub_scsi_cmd_state state;
+ unsigned int tag;
+
+ int error; /* Return code - valid upon done */
+ int act_len; /* Return size */
+
+ int stat_count; /* Retries getting status. */
+
+ /*
+ * We do not support transfers from highmem pages
+ * because the underlying USB framework does not.
+ *
+ * XXX Actually, there is a (very fresh and buggy) support
+ * for transfers under control of struct scatterlist, usb_map_sg()
+ * in 2.6.6, but it seems to have issues with highmem.
+ */
+ char *data; /* Requested buffer */
+ unsigned int len; /* Requested length */
+ // struct scatterlist sgv[UB_MAX_REQ_SG];
+
+ void (*done)(struct ub_dev *, struct ub_scsi_cmd *);
+ void *back;
+};
+
+/*
+ */
+struct ub_capacity {
+ unsigned long nsec; /* Linux size - 512 byte sectors */
+ unsigned int bsize; /* Linux hardsect_size */
+ unsigned int bshift; /* Shift between 512 and hard sects */
+};
+
+/*
+ * The SCSI command tracing structure.
+ */
+
+#define SCMD_ST_HIST_SZ 8
+#define SCMD_TRACE_SZ 5
+
+struct ub_scsi_cmd_trace {
+ unsigned int tag;
+ unsigned char op;
+ unsigned char dir;
+ unsigned int req_size, act_size;
+ int hcur;
+ enum ub_scsi_cmd_state st_hst[SCMD_ST_HIST_SZ]; /* u8? */
+};
+
+struct ub_scsi_trace {
+ int cur;
+ struct ub_scsi_cmd_trace vec[SCMD_TRACE_SZ];
+};
+
+/*
+ * The UB device instance.
+ */
+struct ub_dev {
+ spinlock_t lock;
+ unsigned int id; /* Number among ub's */
+
+ unsigned int tagcnt;
+ int poison;
+ char name[8];
+ struct usb_device *dev;
+ struct usb_interface *intf;
+
+ struct ub_capacity capacity;
+ struct gendisk *disk;
+
+ unsigned int send_bulk_pipe; /* cached pipe values */
+ unsigned int recv_bulk_pipe;
+ unsigned int send_ctrl_pipe;
+ unsigned int recv_ctrl_pipe;
+
+ struct tasklet_struct urb_tasklet;
+
+ /* XXX Use Ingo's mempool (once we have more than one) */
+ int cmda[1];
+ struct ub_scsi_cmd cmdv[1];
+
+ int busy;
+ struct ub_scsi_cmd *top_cmd; /* XXX Under ->busy until we have a queue */
+ struct ub_scsi_cmd top_rqs_cmd; /* REQUEST SENSE */
+ unsigned char top_sense[SENSE_SIZE];
+
+ struct urb work_urb;
+ int last_pipe; /* What might need clearing */
+ struct bulk_cb_wrap work_bcb;
+ struct bulk_cs_wrap work_bcs;
+ struct usb_ctrlrequest work_cr;
+
+ struct ub_scsi_trace tr;
+};
+
+/*
+ */
+static void ub_rw_cmd_done(struct ub_dev *sc, struct ub_scsi_cmd *cmd);
+static void ub_end_rq(struct request *rq, int uptodate);
+static int ub_submit_scsi(struct ub_dev *sc, struct ub_scsi_cmd *cmd);
+static void ub_scsi_urb_complete(struct urb *urb, struct pt_regs *pt);
+static void ub_scsi_urb_action(unsigned long _dev);
+static void ub_scsi_urb_compl(struct ub_dev *sc, struct ub_scsi_cmd *cmd);
+static void ub_state_stat(struct ub_dev *sc, struct ub_scsi_cmd *cmd);
+static int ub_submit_clear_stall(struct ub_dev *sc, struct ub_scsi_cmd *cmd,
+ int stalled_pipe);
+static int ub_submit_top_sense(struct ub_dev *sc, struct ub_scsi_cmd *cmd);
+
+/*
+ */
+static struct usb_device_id ub_usb_ids[] = {
+ { USB_DEVICE_VER(0x0781, 0x0002, 0x0009, 0x0009) }, /* SDDR-31 */
+ { }
+};
+
+MODULE_DEVICE_TABLE(usb, ub_usb_ids);
+
+static unsigned int ub_host_id;
+
+/*
+ * The SCSI command tracing procedures.
+ */
+
+static void ub_cmdtr_new(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
+{
+ int n;
+ struct ub_scsi_cmd_trace *t;
+
+ if ((n = sc->tr.cur + 1) == SCMD_TRACE_SZ) n = 0;
+ t = &sc->tr.vec[n];
+
+ memset(t, 0, sizeof(struct ub_scsi_cmd_trace));
+ t->tag = cmd->tag;
+ t->op = cmd->cdb[0];
+ t->dir = cmd->dir;
+ t->req_size = cmd->len;
+ t->st_hst[0] = cmd->state;
+
+ sc->tr.cur = n;
+}
+
+static void ub_cmdtr_state(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
+{
+ int n;
+ struct ub_scsi_cmd_trace *t;
+
+ t = &sc->tr.vec[sc->tr.cur];
+ if ((n = t->hcur + 1) == SCMD_ST_HIST_SZ) n = 0;
+ t->st_hst[n] = cmd->state;
+ t->hcur = n;
+}
+
+static void ub_cmdtr_act_len(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
+{
+ struct ub_scsi_cmd_trace *t;
+
+ t = &sc->tr.vec[sc->tr.cur];
+ t->act_size = cmd->act_len;
+}
+
+/* The struct disk_attribute must match drivers/block/genhd.c */
+struct disk_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct gendisk *, char *);
+};
+
+static ssize_t ub_sysdiag_show(struct gendisk *disk, char *page)
+{
+ struct ub_dev *sc = disk->private_data;
+ int cnt = 0;
+ unsigned long flags;
+ int nc, nh;
+ int i, j;
+ struct ub_scsi_cmd_trace *t;
+
+ spin_lock_irqsave(&sc->lock, flags);
+ if ((nc = sc->tr.cur + 1) == SCMD_TRACE_SZ) nc = 0;
+ for (j = 0; j < SCMD_TRACE_SZ; j++) {
+ t = &sc->tr.vec[nc];
+
+ cnt += sprintf(page + cnt, "%08x %02x", t->tag, t->op);
+ cnt += sprintf(page + cnt, " %d", t->dir);
+ cnt += sprintf(page + cnt, " [%5d %5d]", t->req_size, t->act_size);
+ if ((nh = t->hcur + 1) == SCMD_ST_HIST_SZ) nh = 0;
+ for (i = 0; i < SCMD_ST_HIST_SZ; i++) {
+ cnt += sprintf(page + cnt, " %d", t->st_hst[nh]);
+ if (++nh == SCMD_ST_HIST_SZ) nh = 0;
+ }
+ cnt += sprintf(page + cnt, "\n");
+
+ if (++nc == SCMD_TRACE_SZ) nc = 0;
+ }
+ spin_unlock_irqrestore(&sc->lock, flags);
+ return cnt;
+}
+
+static struct disk_attribute ub_sysdiag_attr = {
+ .attr = {.name = "diag", .mode = S_IRUGO },
+ .show = ub_sysdiag_show
+};
+
+static int ub_sysdiag_create(struct ub_dev *sc)
+{
+ return sysfs_create_file(&sc->disk->kobj, &ub_sysdiag_attr.attr);
+}
+
+/*
+ * The "command allocator".
+ */
+static struct ub_scsi_cmd *ub_get_cmd(struct ub_dev *sc)
+{
+ struct ub_scsi_cmd *ret;
+
+ if (sc->cmda[0])
+ return NULL;
+ ret = &sc->cmdv[0];
+ sc->cmda[0] = 1;
+ return ret;
+}
+
+static void ub_put_cmd(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
+{
+ if (cmd != &sc->cmdv[0]) {
+ printk(KERN_WARNING DRV_NAME ": "
+ "releasing a foreign cmd %p\n", cmd);
+ return;
+ }
+ if (!sc->cmda[0]) {
+ printk(KERN_WARNING DRV_NAME ": "
+ "releasing a free cmd\n");
+ return;
+ }
+ sc->cmda[0] = 0;
+}
+
+/*
+ * The request function is our main entry point
+ */
+
+static inline int ub_bd_rq_fn_1(request_queue_t *q)
+{
+#if 0
+ int writing = 0, pci_dir, i, n_elem;
+ u32 tmp;
+ unsigned int msg_size;
+#endif
+ struct ub_dev *sc = q->queuedata;
+ struct request *rq;
+#if 0 /* We use rq->buffer for now */
+ struct scatterlist *sg;
+ int n_elem;
+#endif
+ struct ub_scsi_cmd *cmd;
+ int ub_dir;
+ unsigned int block, nblks;
+ int rc;
+
+ if ((rq = elv_next_request(q)) == NULL)
+ return 1;
+
+ if ((cmd = ub_get_cmd(sc)) == NULL) {
+ blk_stop_queue(q);
+ return 1;
+ }
+
+ blkdev_dequeue_request(rq);
+
+ if (rq_data_dir(rq) == WRITE)
+ ub_dir = UB_DIR_WRITE;
+ else
+ ub_dir = UB_DIR_READ;
+
+ /*
+ * get scatterlist from block layer
+ */
+#if 0 /* We use rq->buffer for now */
+ sg = &cmd->sgv[0];
+ n_elem = blk_rq_map_sg(q, rq, sg);
+ if (n_elem <= 0) {
+ ub_put_cmd(sc, cmd);
+ ub_end_rq(rq, 0);
+ blk_start_queue(q);
+ return 0; /* request with no s/g entries? */
+ }
+
+ if (n_elem != 1) { /* Paranoia */
+ printk(KERN_WARNING DRV_NAME ": request with %d segments\n",
+ n_elem);
+ ub_put_cmd(sc, cmd);
+ ub_end_rq(rq, 0);
+ blk_start_queue(q);
+ return 0;
+ }
+#endif
+ /*
+ * XXX Unfortunately, this check does not work. It is quite possible
+ * to get bogus non-null rq->buffer if you allow sg by mistake.
+ */
+ if (rq->buffer == NULL) {
+ /*
+ * This must not happen if we set the queue right.
+ * The block level must create bounce buffers for us.
+ */
+ static int do_print = 1;
+ if (do_print) {
+ printk(KERN_WARNING DRV_NAME ": unmapped request\n");
+ do_print = 0;
+ }
+ ub_put_cmd(sc, cmd);
+ ub_end_rq(rq, 0);
+ blk_start_queue(q);
+ return 0;
+ }
+
+#if 0 /* Just to show what carmel does in this place... We won't need this. */
+ /* map scatterlist to PCI bus addresses */
+ n_elem = pci_map_sg(host->pdev, sg, n_elem, pci_dir);
+ if (n_elem <= 0) {
+ carm_end_rq(host, crq, 0);
+ return 1; /* request with no s/g entries? */
+ }
+ crq->n_elem = n_elem;
+ crq->port = port;
+ host->hw_sg_used += n_elem;
+#endif
+
+ /*
+ * build the command
+ */
+ block = rq->sector;
+ nblks = rq->nr_sectors;
+
+ memset(cmd, 0, sizeof(struct ub_scsi_cmd));
+ cmd->cdb[0] = (ub_dir == UB_DIR_READ)? READ_10: WRITE_10;
+ /* 10-byte uses 4 bytes of LBA: 2147483648KB, 2097152MB, 2048GB */
+ cmd->cdb[2] = block >> 24;
+ cmd->cdb[3] = block >> 16;
+ cmd->cdb[4] = block >> 8;
+ cmd->cdb[5] = block;
+ cmd->cdb[7] = nblks >> 8;
+ cmd->cdb[8] = nblks;
+ cmd->cdb_len = 10;
+ cmd->dir = ub_dir;
+ cmd->state = UB_CMDST_INIT;
+ cmd->data = rq->buffer;
+ cmd->len = nblks * 512;
+ cmd->done = ub_rw_cmd_done;
+ cmd->back = rq;
+
+ cmd->tag = sc->tagcnt++;
+ if ((rc = ub_submit_scsi(sc, cmd)) != 0) {
+ /* P3 */ printk("ub: cannot submit rw (%d)\n", rc);
+ ub_put_cmd(sc, cmd);
+ ub_end_rq(rq, 0);
+ blk_start_queue(q);
+ return 0;
+ }
+
+ return 0;
+}
+
+static void ub_bd_rq_fn(request_queue_t *q)
+{
+ do { } while (ub_bd_rq_fn_1(q) == 0);
+}
+
+static void ub_rw_cmd_done(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
+{
+ struct request *rq = cmd->back;
+ struct gendisk *disk = sc->disk;
+ request_queue_t *q = disk->queue;
+ int uptodate;
+
+ if (cmd->error == 0)
+ uptodate = 1;
+ else
+ uptodate = 0;
+
+ ub_put_cmd(sc, cmd);
+ ub_end_rq(rq, uptodate);
+ blk_start_queue(q);
+}
+
+static void ub_end_rq(struct request *rq, int uptodate)
+{
+ int rc;
+
+ rc = end_that_request_first(rq, uptodate, rq->hard_nr_sectors);
+ // assert(rc == 0);
+ end_that_request_last(rq);
+}
+
+/*
+ * Submit a SCSI operation.
+ *
+ * This is only called customarily from a soft interrupt or a process,
+ * so it can call back without worrying about a recursion.
+ *
+ * The Iron Law of Good Submit Routine is:
+ * Zero return - callback is done, Nonzero return - callback is not done.
+ * No exceptions.
+ *
+ * Host is assumed locked.
+ *
+ * XXX We only support Bulk for the moment.
+ */
+static int ub_submit_scsi(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
+{
+ struct bulk_cb_wrap *bcb;
+ int rc;
+
+ /*
+ * We do not trap into BUG() here because the consequences are
+ * sporadic stack overflows. Better to warn than to die outright.
+ */
+ if (in_irq()) {
+ static int first_warning = 1;
+ if (first_warning) {
+ printk(KERN_WARNING DRV_NAME ": "
+ "submitting command from a hard irq\n");
+ first_warning = 0;
+ }
+ }
+
+ /* XXX Set up a queue */
+ if (cmd != &sc->top_rqs_cmd) { /* XXX This is getting out of hand. */
+ if (sc->busy)
+ return -EBUSY;
+ sc->busy = 1;
+ sc->top_cmd = cmd;
+ } else {
+ if (!sc->busy) {
+ /* P3 */ printk("ub: sensing while not busy\n");
+ return -EBUSY;
+ }
+ }
+
+ ub_cmdtr_new(sc, cmd);
+
+ if (cmd->state != UB_CMDST_INIT ||
+ (cmd->dir != UB_DIR_NONE && cmd->len == 0)) {
+ sc->top_cmd = NULL;
+ sc->busy = 0;
+ return -EINVAL;
+ }
+
+ bcb = &sc->work_bcb;
+
+ /* set up the command wrapper */
+ bcb->Signature = cpu_to_le32(US_BULK_CB_SIGN);
+ bcb->Tag = cmd->tag; /* Endianness is not important */
+ bcb->DataTransferLength = cpu_to_le32(cmd->len);
+ bcb->Flags = (cmd->dir == UB_DIR_READ) ? 0x80 : 0;
+ bcb->Lun = 0; /* No multi-LUN yet */
+ bcb->Length = cmd->cdb_len;
+
+ /* copy the command payload */
+ memcpy(bcb->CDB, cmd->cdb, MAX_CDB_SIZE);
+
+ sc->last_pipe = sc->send_bulk_pipe;
+ usb_fill_bulk_urb(&sc->work_urb, sc->dev, sc->send_bulk_pipe,
+ bcb, US_BULK_CB_WRAP_LEN,
+ ub_scsi_urb_complete, sc);
+
+ /* Fill what we shouldn't be filling just because usb-storage did. */
+ sc->work_urb.actual_length = 0;
+ sc->work_urb.error_count = 0;
+ sc->work_urb.status = 0;
+
+ /* XXX Add a timeout (always, because we have no aborts) */
+
+ if ((rc = usb_submit_urb(&sc->work_urb, GFP_ATOMIC)) != 0) {
+
+ /* XXX Clear stalls */
+ printk("ub: cmd #%d submit failed (%d)\n", cmd->tag, rc); /* P3 */
+
+ cmd->state = UB_CMDST_DONE;
+ ub_cmdtr_state(sc, cmd);
+ sc->busy = 0;
+ sc->top_cmd = NULL;
+ return rc;
+ }
+
+ cmd->state = UB_CMDST_CMD;
+ ub_cmdtr_state(sc, cmd);
+ return 0;
+}
+
+/*
+ * Completion routine for the work URB.
+ *
+ * This can be called directly from usb_submit_urb (while we have
+ * the sc->lock taken) and from an interrupt (while we do NOT have
+ * the sc->lock taken). Therefore, bounce this off to a tasklet.
+ */
+static void ub_scsi_urb_complete(struct urb *urb, struct pt_regs *pt)
+{
+ struct ub_dev *sc = urb->context;
+
+ tasklet_schedule(&sc->urb_tasklet);
+}
+
+static void ub_scsi_urb_action(unsigned long _dev)
+{
+ struct ub_dev *sc = (struct ub_dev *) _dev;
+ unsigned long flags;
+ struct ub_scsi_cmd *cmd;
+
+ spin_lock_irqsave(&sc->lock, flags);
+ if (sc->busy) {
+ cmd = sc->top_cmd;
+ if (cmd->state == UB_CMDST_SENSE) { /* XXX Kludgy */
+ cmd = &sc->top_rqs_cmd;
+ }
+ ub_scsi_urb_compl(sc, cmd);
+ } else {
+ /* Never happens */
+ /* P3 */ printk("ub: Action on idle device\n");
+ }
+ spin_unlock_irqrestore(&sc->lock, flags);
+}
+
+static void ub_scsi_urb_compl(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
+{
+ struct urb *urb = &sc->work_urb;
+ struct bulk_cs_wrap *bcs;
+ int pipe;
+ int rc;
+
+/* P3 */ printk("ub: urb status %d pipe 0x%08x len %d act %d\n",
+ urb->status, urb->pipe, urb->transfer_buffer_length, urb->actual_length);
+
+ /* XXX if (sc->poison) */
+
+ if (cmd->state == UB_CMDST_CLEAR) {
+ if (urb->status == -EPIPE) {
+ /*
+ * STALL while clearning STALL.
+ * A STALL is illegal on a control pipe!
+ * XXX Might try to reset the device here and retry.
+ */
+ printk(KERN_NOTICE DRV_NAME ": "
+ "stall on control pipe for device %u\n",
+ sc->dev->devnum);
+ goto Bad_End;
+ }
+
+ /*
+ * We ignore the result for the halt clear.
+ */
+
+ /* reset the toggles and endpoint flags */
+ usb_endpoint_running(sc->dev, usb_pipeendpoint(sc->last_pipe),
+ usb_pipeout(sc->last_pipe));
+ usb_settoggle(sc->dev, usb_pipeendpoint(sc->last_pipe),
+ usb_pipeout(sc->last_pipe), 0);
+
+ if (cmd->cdb[0] == REQUEST_SENSE) {
+ cmd->error = -EPIPE;
+ cmd->state = UB_CMDST_DONE;
+ ub_cmdtr_state(sc, cmd);
+ if (cmd != &sc->top_rqs_cmd) { /* XXX This is getting out of hand. */
+ sc->busy = 0;
+ sc->top_cmd = NULL;
+ }
+ (*cmd->done)(sc, cmd);
+ return;
+ }
+
+ memset(&sc->top_sense, 0, SENSE_SIZE);
+ if ((rc = ub_submit_top_sense(sc, cmd)) != 0) {
+ printk(KERN_NOTICE DRV_NAME ": "
+ "unable to submit sense for device %u (%d)\n",
+ sc->dev->devnum, rc);
+ goto Bad_End;
+ }
+ cmd->state = UB_CMDST_SENSE;
+ ub_cmdtr_state(sc, cmd);
+
+ } else if (cmd->state == UB_CMDST_CLR2STS) {
+ if (urb->status == -EPIPE) {
+ /*
+ * STALL while clearning STALL.
+ * A STALL is illegal on a control pipe!
+ * XXX Might try to reset the device here and retry.
+ */
+ printk(KERN_NOTICE DRV_NAME ": "
+ "stall on control pipe for device %u\n",
+ sc->dev->devnum);
+ goto Bad_End;
+ }
+
+ /*
+ * We ignore the result for the halt clear.
+ */
+
+ /* reset the toggles and endpoint flags */
+ usb_endpoint_running(sc->dev, usb_pipeendpoint(sc->last_pipe),
+ usb_pipeout(sc->last_pipe));
+ usb_settoggle(sc->dev, usb_pipeendpoint(sc->last_pipe),
+ usb_pipeout(sc->last_pipe), 0);
+
+ ub_state_stat(sc, cmd);
+
+ } else if (cmd->state == UB_CMDST_CMD) {
+ if (urb->status == -EPIPE) {
+ rc = ub_submit_clear_stall(sc, cmd, sc->last_pipe);
+ if (rc != 0) {
+ printk(KERN_NOTICE DRV_NAME ": "
+ "unable to submit clear for device %u (%d)\n",
+ sc->dev->devnum, rc);
+ /*
+ * This is typically ENOMEM or some other such shit.
+ * Retrying is pointless. Just do Bad End on it...
+ */
+ goto Bad_End;
+ }
+ cmd->state = UB_CMDST_CLEAR;
+ ub_cmdtr_state(sc, cmd);
+ return;
+ }
+ if (urb->status != 0)
+ goto Bad_End;
+ if (urb->actual_length != US_BULK_CB_WRAP_LEN) {
+ /* XXX Must do reset here to unconfuse the device */
+ goto Bad_End;
+ }
+
+ if (cmd->dir == UB_DIR_NONE) {
+ ub_state_stat(sc, cmd);
+ return;
+ }
+
+ if (cmd->dir == UB_DIR_READ)
+ pipe = sc->recv_bulk_pipe;
+ else
+ pipe = sc->send_bulk_pipe;
+ sc->last_pipe = pipe;
+ usb_fill_bulk_urb(&sc->work_urb, sc->dev, pipe,
+ cmd->data, cmd->len,
+ ub_scsi_urb_complete, sc);
+ sc->work_urb.actual_length = 0;
+ sc->work_urb.error_count = 0;
+ sc->work_urb.status = 0;
+
+ if ((rc = usb_submit_urb(&sc->work_urb, GFP_ATOMIC)) != 0) {
+
+ /* XXX Clear stalls */
+ printk("ub: data #%d submit failed (%d)\n", cmd->tag, rc); /* P3 */
+
+ cmd->error = rc;
+ cmd->state = UB_CMDST_DONE;
+ ub_cmdtr_state(sc, cmd);
+ if (cmd != &sc->top_rqs_cmd) { /* XXX This is getting out of hand. */
+ sc->busy = 0;
+ sc->top_cmd = NULL;
+ }
+ (*cmd->done)(sc, cmd);
+ return;
+ }
+
+ cmd->state = UB_CMDST_DATA;
+ ub_cmdtr_state(sc, cmd);
+
+ } else if (cmd->state == UB_CMDST_DATA) {
+ if (urb->status == -EPIPE) {
+ rc = ub_submit_clear_stall(sc, cmd, sc->last_pipe);
+ if (rc != 0) {
+ printk(KERN_NOTICE DRV_NAME ": "
+ "unable to submit clear for device %u (%d)\n",
+ sc->dev->devnum, rc);
+ /*
+ * This is typically ENOMEM or some other such shit.
+ * Retrying is pointless. Just do Bad End on it...
+ */
+ goto Bad_End;
+ }
+ cmd->state = UB_CMDST_CLR2STS;
+ ub_cmdtr_state(sc, cmd);
+ return;
+ }
+ if (urb->status == -EOVERFLOW) {
+ /*
+ * A babble? Failure, but we must transfer CSW now.
+ */
+ cmd->error = -EOVERFLOW; /* A cheap trick... */
+ } else {
+ if (urb->status != 0)
+ goto Bad_End;
+ }
+
+ cmd->act_len = urb->actual_length;
+ ub_cmdtr_act_len(sc, cmd);
+
+ ub_state_stat(sc, cmd);
+
+ } else if (cmd->state == UB_CMDST_STAT) {
+ if (urb->status == -EPIPE) {
+ rc = ub_submit_clear_stall(sc, cmd, sc->last_pipe);
+ if (rc != 0) {
+ printk(KERN_NOTICE DRV_NAME ": "
+ "unable to submit clear for device %u (%d)\n",
+ sc->dev->devnum, rc);
+ /*
+ * This is typically ENOMEM or some other such shit.
+ * Retrying is pointless. Just do Bad End on it...
+ */
+ goto Bad_End;
+ }
+ cmd->state = UB_CMDST_CLEAR;
+ ub_cmdtr_state(sc, cmd);
+ return;
+ }
+ if (urb->status != 0)
+ goto Bad_End;
+
+ if (urb->actual_length == 0) {
+ /*
+ * Some broken devices add unnecessary zero-length
+ * packets to the end of their data transfers.
+ * Such packets show up as 0-length CSWs. If we
+ * encounter such a thing, try to read the CSW again.
+ */
+ if (++cmd->stat_count >= 4) {
+ printk(KERN_NOTICE DRV_NAME ": "
+ "unable to get CSW on device %u\n",
+ sc->dev->devnum);
+ goto Bad_End;
+ }
+
+ /*
+ * ub_state_stat only not dropping the count...
+ */
+ sc->last_pipe = sc->recv_bulk_pipe;
+ usb_fill_bulk_urb(&sc->work_urb, sc->dev,
+ sc->recv_bulk_pipe, &sc->work_bcs,
+ US_BULK_CS_WRAP_LEN, ub_scsi_urb_complete, sc);
+ sc->work_urb.actual_length = 0;
+ sc->work_urb.error_count = 0;
+ sc->work_urb.status = 0;
+
+ rc = usb_submit_urb(&sc->work_urb, GFP_ATOMIC);
+ if (rc != 0) {
+ /* XXX Clear stalls */
+ printk("ub: CSW #%d submit failed (%d)\n", cmd->tag, rc); /* P3 */
+
+ cmd->error = rc;
+ cmd->state = UB_CMDST_DONE;
+ ub_cmdtr_state(sc, cmd);
+ if (cmd != &sc->top_rqs_cmd) { /* XXX This is getting out of hand. */
+ sc->busy = 0;
+ sc->top_cmd = NULL;
+ }
+ (*cmd->done)(sc, cmd);
+ return;
+ }
+ return;
+ }
+
+ /*
+ * Check the returned Bulk protocol status.
+ */
+
+ bcs = &sc->work_bcs;
+ rc = le32_to_cpu(bcs->Residue);
+ if (rc != cmd->len - cmd->act_len) {
+ /*
+ * It is all right to transfer less, the caller has
+ * to check. But it's not all right if the device
+ * counts disagree with our counts.
+ */
+ /* P3 */ printk("ub: resid %d len %d act %d\n",
+ rc, cmd->len, cmd->act_len);
+ goto Bad_End;
+ }
+
+ if (bcs->Signature != cpu_to_le32(US_BULK_CS_SIGN) &&
+ bcs->Signature != cpu_to_le32(US_BULK_CS_OLYMPUS_SIGN)) {
+ /* XXX Rate-limit, even for P3 tagged */
+ /* P3 */ printk("ub: signature 0x%x\n", bcs->Signature);
+ /* Windows ignores signatures, so do we. */
+ }
+
+ if (bcs->Tag != cmd->tag) {
+ /* P3 */ printk("ub: tag orig 0x%x reply 0x%x\n",
+ cmd->tag, bcs->Tag);
+ goto Bad_End;
+ }
+
+ switch (bcs->Status) {
+ case US_BULK_STAT_OK:
+ break;
+
+ case US_BULK_STAT_FAIL:
+ /* P3 */ printk("ub: status FAIL\n");
+ goto Bad_End;
+
+ case US_BULK_STAT_PHASE:
+ /* XXX We must reset the transport here */
+ /* P3 */ printk("ub: status PHASE\n");
+ goto Bad_End;
+ }
+
+ cmd->state = UB_CMDST_DONE;
+ ub_cmdtr_state(sc, cmd);
+ if (cmd != &sc->top_rqs_cmd) { /* XXX This is getting out of hand. */
+ sc->busy = 0;
+ sc->top_cmd = NULL;
+ }
+ (*cmd->done)(sc, cmd);
+
+ } else if (cmd->state == UB_CMDST_SENSE) {
+ if (urb->status == -EPIPE) {
+ /*
+ * This is not possible, because other instance
+ * of a command state machine processes this.
+ * Definitely badend this shit, and do non-P3 warning.
+ */
+ printk(KERN_NOTICE DRV_NAME ": "
+ "stall while sensing device %u\n",
+ sc->dev->devnum);
+ goto Bad_End;
+ }
+ if (urb->status != 0)
+ goto Bad_End;
+
+ /*
+ * We do not look at sense, because even if there was no
+ * sense, we only get into UB_CMDST_SENSE from a STALL.
+ * We request sense because we want to clear CHECK CONDITION
+ * on devices with delusions of SCSI, and not because we
+ * are curious in any way about the sense itself.
+ */
+ /* if ((cmd->top_sense[2] & 0x0F) == NO_SENSE) ..... */
+
+ cmd->error = -EIO;
+ cmd->state = UB_CMDST_DONE;
+ ub_cmdtr_state(sc, cmd);
+ if (cmd != &sc->top_rqs_cmd) { /* XXX This is getting out of hand. */
+ sc->busy = 0;
+ sc->top_cmd = NULL;
+ }
+ (*cmd->done)(sc, cmd);
+ } else {
+ printk(KERN_WARNING DRV_NAME ": "
+ "wrong command state %d on device %u\n",
+ cmd->state, sc->dev->devnum);
+ goto Bad_End;
+ }
+ return;
+
+Bad_End: /* 4chan R.I.P. XXX Remove later when have processing for aborts. */
+ cmd->error = -EIO;
+ cmd->state = UB_CMDST_DONE;
+ ub_cmdtr_state(sc, cmd);
+ if (cmd != &sc->top_rqs_cmd) { /* XXX This is getting out of hand. */
+ sc->busy = 0;
+ sc->top_cmd = NULL;
+ }
+ (*cmd->done)(sc, cmd);
+}
+
+/*
+ * Factorization helper for the command state machine:
+ * Submit a CSW read and go to STAT state.
+ */
+static void ub_state_stat(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
+{
+ int rc;
+
+ sc->last_pipe = sc->recv_bulk_pipe;
+ usb_fill_bulk_urb(&sc->work_urb, sc->dev, sc->recv_bulk_pipe,
+ &sc->work_bcs, US_BULK_CS_WRAP_LEN,
+ ub_scsi_urb_complete, sc);
+ sc->work_urb.actual_length = 0;
+ sc->work_urb.error_count = 0;
+ sc->work_urb.status = 0;
+
+ if ((rc = usb_submit_urb(&sc->work_urb, GFP_ATOMIC)) != 0) {
+
+ /* XXX Clear stalls */
+ printk("ub: CSW #%d submit failed (%d)\n", cmd->tag, rc); /* P3 */
+
+ cmd->error = rc;
+ cmd->state = UB_CMDST_DONE;
+ ub_cmdtr_state(sc, cmd);
+ if (cmd != &sc->top_rqs_cmd) { /* XXX This is getting out of hand. */
+ sc->busy = 0;
+ sc->top_cmd = NULL;
+ }
+ (*cmd->done)(sc, cmd);
+ return;
+ }
+
+ cmd->stat_count = 0;
+ cmd->state = UB_CMDST_STAT;
+ ub_cmdtr_state(sc, cmd);
+}
+
+/*
+ * A helper for the command's state machine:
+ * Submit a stall clear.
+ */
+static int ub_submit_clear_stall(struct ub_dev *sc, struct ub_scsi_cmd *cmd,
+ int stalled_pipe)
+{
+ int endp;
+ struct usb_ctrlrequest *cr;
+
+ endp = usb_pipeendpoint(stalled_pipe);
+ if (usb_pipein (stalled_pipe))
+ endp |= USB_DIR_IN;
+
+ cr = &sc->work_cr;
+ cr->bRequestType = USB_RECIP_ENDPOINT;
+ cr->bRequest = USB_REQ_CLEAR_FEATURE;
+ cr->wValue = cpu_to_le16(USB_ENDPOINT_HALT);
+ cr->wIndex = cpu_to_le16(endp);
+ cr->wLength = cpu_to_le16(0);
+
+ usb_fill_control_urb(&sc->work_urb, sc->dev, sc->send_ctrl_pipe,
+ (unsigned char*) cr, NULL, 0,
+ ub_scsi_urb_complete, sc);
+
+ sc->work_urb.actual_length = 0;
+ sc->work_urb.error_count = 0;
+ sc->work_urb.status = 0;
+
+ return usb_submit_urb(&sc->work_urb, GFP_ATOMIC);
+}
+
+/*
+ */
+static void ub_top_sense_done(struct ub_dev *sc, struct ub_scsi_cmd *scmd)
+{
+ unsigned char *sense = scmd->data;
+ struct ub_scsi_cmd *cmd;
+
+ printk("ub: sense bytes %d key 0x%x asc 0x%x ascq 0x%x\n",
+ sense[7], sense[2] & 0x0F, sense[12], sense[13]); /* P3 */
+
+ if (!sc->busy) { /* P3 */
+ printk(KERN_WARNING DRV_NAME ": sense done while idle\n");
+ return;
+ }
+ cmd = scmd->back;
+ if (cmd->state != UB_CMDST_SENSE) {
+ printk(KERN_WARNING DRV_NAME ": "
+ "sense done with bad cmd state %d on device %u\n",
+ cmd->state, sc->dev->devnum);
+ return;
+ }
+
+ ub_scsi_urb_compl(sc, cmd);
+}
+
+static int ub_submit_top_sense(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
+{
+ struct ub_scsi_cmd *scmd;
+
+ scmd = &sc->top_rqs_cmd;
+
+ /* XXX Can be done at init */
+ scmd->cdb[0] = REQUEST_SENSE;
+ scmd->cdb_len = 6;
+ scmd->dir = UB_DIR_READ;
+ scmd->state = UB_CMDST_INIT;
+ scmd->data = sc->top_sense;
+ scmd->len = SENSE_SIZE;
+ scmd->done = ub_top_sense_done;
+ scmd->back = cmd;
+
+ scmd->tag = sc->tagcnt++;
+ return ub_submit_scsi(sc, scmd);
+}
+
+#if 0
+/* Determine what the maximum LUN supported is */
+int usb_stor_Bulk_max_lun(struct us_data *us)
+{
+ int result;
+
+ /* issue the command */
+ result = usb_stor_control_msg(us, us->recv_ctrl_pipe,
+ US_BULK_GET_MAX_LUN,
+ USB_DIR_IN | USB_TYPE_CLASS |
+ USB_RECIP_INTERFACE,
+ 0, us->ifnum, us->iobuf, 1, HZ);
+
+ /*
+ * Some devices (i.e. Iomega Zip100) need this -- apparently
+ * the bulk pipes get STALLed when the GetMaxLUN request is
+ * processed. This is, in theory, harmless to all other devices
+ * (regardless of if they stall or not).
+ */
+ if (result < 0) {
+ usb_stor_clear_halt(us, us->recv_bulk_pipe);
+ usb_stor_clear_halt(us, us->send_bulk_pipe);
+ }
+
+ US_DEBUGP("GetMaxLUN command result is %d, data is %d\n",
+ result, us->iobuf[0]);
+
+ /* if we have a successful request, return the result */
+ if (result == 1)
+ return us->iobuf[0];
+
+ /* return the default -- no LUNs */
+ return 0;
+}
+#endif
+
+static int ub_bd_ioctl(struct inode *inode, struct file *filp,
+ unsigned int cmd, unsigned long arg)
+{
+// void __user *usermem = (void *) arg;
+// struct carm_port *port = ino->i_bdev->bd_disk->private_data;
+// struct hd_geometry geom;
+
+#if 0
+ switch (cmd) {
+ case HDIO_GETGEO:
+ if (usermem == NULL) // XXX Bizzare. Why?
+ return -EINVAL;
+
+ geom.heads = (u8) port->dev_geom_head;
+ geom.sectors = (u8) port->dev_geom_sect;
+ geom.cylinders = port->dev_geom_cyl;
+ geom.start = get_start_sect(ino->i_bdev);
+
+ if (copy_to_user(usermem, &geom, sizeof(geom)))
+ return -EFAULT;
+ return 0;
+
+ default: ;
+ }
+#endif
+
+ return -ENOTTY;
+}
+
+static struct block_device_operations ub_bd_fops = {
+ .owner = THIS_MODULE,
+ .ioctl = ub_bd_ioctl,
+ // .revalidate_disk = cciss_revalidate,
+};
+
+/*
+ * Read the SCSI capacity synchronously (for probing).
+ */
+static void ub_probe_read_cap_done(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
+{
+ struct completion *cop = cmd->back;
+ complete(cop);
+}
+
+static int ub_probe_read_cap(struct ub_dev *sc, struct ub_capacity *ret)
+{
+ struct ub_scsi_cmd *cmd;
+ char *p;
+ enum { ALLOC_SIZE = sizeof(struct ub_scsi_cmd) + 8 };
+ unsigned long flags;
+ unsigned int bsize, shift;
+ unsigned long nsec;
+ struct completion compl;
+ int rc;
+
+ init_completion(&compl);
+
+ rc = -ENOMEM;
+ if ((cmd = kmalloc(ALLOC_SIZE, GFP_KERNEL)) == NULL)
+ goto err_alloc;
+ memset(cmd, 0, ALLOC_SIZE);
+ p = (char *)cmd + sizeof(struct ub_scsi_cmd);
+
+ cmd->cdb[0] = 0x25;
+ cmd->cdb_len = 10;
+ cmd->dir = UB_DIR_READ;
+ cmd->state = UB_CMDST_INIT;
+ cmd->data = p;
+ cmd->len = 8;
+ cmd->done = ub_probe_read_cap_done;
+ cmd->back = &compl;
+
+ spin_lock_irqsave(&sc->lock, flags);
+ cmd->tag = sc->tagcnt++;
+
+ rc = ub_submit_scsi(sc, cmd);
+ spin_unlock_irqrestore(&sc->lock, flags);
+
+ if (rc != 0) {
+ printk("ub: reading capacity: submit error (%d)\n", rc); /* P3 */
+ goto err_submit;
+ }
+
+ wait_for_completion(&compl);
+
+ if (cmd->error != 0) {
+ printk("ub: reading capacity: error %d\n", cmd->error); /* P3 */
+ rc = -EIO;
+ goto err_read;
+ }
+ if (cmd->act_len != 8) {
+ printk("ub: reading capacity: size %d\n", cmd->act_len); /* P3 */
+ rc = -EIO;
+ goto err_read;
+ }
+
+ nsec = be32_to_cpu(*(u32 *)p) + 1;
+ bsize = be32_to_cpu(*(u32 *)(p + 4));
+ switch (bsize) {
+ case 512: shift = 0; break;
+ case 1024: shift = 1; break;
+ case 2048: shift = 2; break;
+ case 4096: shift = 3; break;
+ default:
+ printk("ub: Bad sector size %u\n", bsize); /* P3 */
+ rc = -EDOM;
+ goto err_inv_bsize;
+ }
+
+ ret->bsize = bsize;
+ ret->bshift = shift;
+ ret->nsec = nsec << shift;
+ rc = 0;
+
+err_inv_bsize:
+err_read:
+err_submit:
+ kfree(cmd);
+err_alloc:
+ return rc;
+}
+
+/* Get the pipe settings */
+static int ub_get_pipes(struct ub_dev *sc, struct usb_device *dev,
+ struct usb_interface *intf)
+{
+ struct usb_host_interface *altsetting = intf->cur_altsetting;
+ struct usb_endpoint_descriptor *ep_in = NULL;
+ struct usb_endpoint_descriptor *ep_out = NULL;
+ struct usb_endpoint_descriptor *ep;
+ int i;
+
+ /*
+ * Find the endpoints we need.
+ * We are expecting a minimum of 2 endpoints - in and out (bulk).
+ * We will ignore any others.
+ */
+ for (i = 0; i < altsetting->desc.bNumEndpoints; i++) {
+ ep = &altsetting->endpoint[i].desc;
+
+ /* Is it a BULK endpoint? */
+ if ((ep->bmAttributes & USB_ENDPOINT_XFERTYPE_MASK)
+ == USB_ENDPOINT_XFER_BULK) {
+ /* BULK in or out? */
+ if (ep->bEndpointAddress & USB_DIR_IN)
+ ep_in = ep;
+ else
+ ep_out = ep;
+ }
+ }
+
+ if (ep_in == NULL || ep_out == NULL) {
+ printk(KERN_NOTICE DRV_NAME ": "
+ "device %u failed endpoint check\n",
+ sc->dev->devnum);
+ return -EIO;
+ }
+
+ /* Calculate and store the pipe values */
+ sc->send_ctrl_pipe = usb_sndctrlpipe(dev, 0);
+ sc->recv_ctrl_pipe = usb_rcvctrlpipe(dev, 0);
+ sc->send_bulk_pipe = usb_sndbulkpipe(dev,
+ ep_out->bEndpointAddress & USB_ENDPOINT_NUMBER_MASK);
+ sc->recv_bulk_pipe = usb_rcvbulkpipe(dev,
+ ep_in->bEndpointAddress & USB_ENDPOINT_NUMBER_MASK);
+
+ return 0;
+}
+
+/*
+ * Probing is done in the process context, which allows us to cheat
+ * and not to build a state machine for the discovery.
+ */
+static int ub_probe(struct usb_interface *intf,
+ const struct usb_device_id *id)
+{
+ struct ub_dev *sc;
+ request_queue_t *q;
+ struct gendisk *disk;
+ int rc;
+
+ rc = -EDOM;
+ if (ub_host_id >= 26) /* XXX Lame naming scheme */
+ goto err_over;
+
+ rc = -ENOMEM;
+ if ((sc = kmalloc(sizeof(struct ub_dev), GFP_KERNEL)) == NULL)
+ goto err_core;
+ memset(sc, 0, sizeof(struct ub_dev));
+ spin_lock_init(&sc->lock);
+ sc->id = ub_host_id;
+ snprintf(sc->name, 8, "ub%c", sc->id + 'a');
+ usb_init_urb(&sc->work_urb);
+ tasklet_init(&sc->urb_tasklet, ub_scsi_urb_action, (unsigned long)sc);
+
+ sc->dev = interface_to_usbdev(intf);
+ // sc->intf = intf;
+ // sc->ifnum = intf->cur_altsetting->desc.bInterfaceNumber;
+
+ usb_set_intfdata(intf, sc);
+ usb_get_dev(sc->dev);
+
+ /* XXX Verify that we can handle the device (from descriptors) */
+
+ ub_get_pipes(sc, sc->dev, intf);
+
+ /*
+ * At this point, all USB initialization is done, do upper layer.
+ * We really hate halfway initialized structures, so from the
+ * invariants perspective, this ub_dev is fully constructed at
+ * this point.
+ */
+
+ if ((rc = register_blkdev(UB_MAJOR, sc->name)) != 0)
+ goto err_regblkdev;
+
+ devfs_mk_dir("ub");
+
+ if ((rc = ub_probe_read_cap(sc, &sc->capacity)) != 0) {
+ /* XXX Retry does actually happen, our code must be defective */
+ if ((rc = ub_probe_read_cap(sc, &sc->capacity)) != 0) {
+ sc->capacity.nsec = 100;
+ sc->capacity.bsize = 512;
+ sc->capacity.bshift = 0;
+ }
+ }
+ /* This is pretty much a long term P3 */
+ printk(KERN_INFO DRV_NAME ": device %u capacity nsec %ld bsize %u\n",
+ sc->dev->devnum, sc->capacity.nsec, sc->capacity.bsize);
+
+ /*
+ * Just one disk per sc currently, but maybe more.
+ */
+ rc = -ENOMEM;
+ if ((disk = alloc_disk(UB_MINORS_PER_MAJOR)) == NULL)
+ goto err_diskalloc;
+
+ sc->disk = disk;
+ sprintf(disk->disk_name, "ub%c", sc->id + 'a');
+ sprintf(disk->devfs_name, "ub/%c", sc->id + 'a');
+ disk->major = UB_MAJOR;
+ disk->first_minor = 0;
+ disk->fops = &ub_bd_fops;
+ disk->private_data = sc;
+
+ rc = -ENOMEM;
+ if ((q = blk_init_queue(ub_bd_rq_fn, &sc->lock)) == NULL)
+ goto err_blkqinit;
+
+ disk->queue = q;
+
+ // blk_queue_bounce_limit(q, hba[i]->pdev->dma_mask);
+ blk_queue_max_hw_segments(q, UB_MAX_REQ_SG);
+ blk_queue_max_phys_segments(q, UB_MAX_REQ_SG);
+ // blk_queue_segment_boundary(q, CARM_SG_BOUNDARY);
+ // blk_queue_max_sectors(q, 512);
+ // blk_queue_hardsect_size(q, xxxxx);
+
+ /*
+ * This is a serious infraction, caused by a deficiency in the
+ * USB sg interface (usb_sg_wait()). We plan to remove this once
+ * we get mileage on the driver and can justify a change to USB API.
+ * See blk_queue_bounce_limit() to understand this part.
+ */
+ q->bounce_pfn = blk_max_low_pfn;
+ q->bounce_gfp = GFP_NOIO;
+
+ q->queuedata = sc;
+
+ set_capacity(disk, sc->capacity.nsec);
+ add_disk(disk);
+
+ /* Upper level is all done, doing inits on a fully functional device */
+
+ ub_sysdiag_create(sc);
+
+ ub_host_id++; /* Protected by dev->serialize in USB core */
+ return 0;
+
+err_blkqinit:
+ put_disk(disk);
+err_diskalloc:
+ devfs_remove("ub");
+ unregister_blkdev(UB_MAJOR, sc->name);
+err_regblkdev:
+ // blk_cleanup_queue(sc->oob_q);
+// err_ooballoc:
+ usb_set_intfdata(intf, NULL);
+ usb_put_dev(sc->dev);
+ kfree(sc);
+err_core:
+err_over:
+ return rc;
+}
+
+static void ub_disconnect(struct usb_interface *intf)
+{
+ struct ub_dev *sc = usb_get_intfdata(intf);
+ struct gendisk *disk = sc->disk;
+ request_queue_t *q = disk->queue;
+ unsigned long flags;
+
+ /*
+ * Fence stall clearnings, operations triggered by unlinkings and so on.
+ */
+ spin_lock_irqsave(&sc->lock, flags);
+ sc->poison = 1;
+ spin_unlock_irqrestore(&sc->lock, flags);
+
+ /*
+ * Unregister the upper layer, this waits for all commands to end.
+ * XXX Does it, actually? Verify!
+ */
+ if (disk->flags & GENHD_FL_UP)
+ del_gendisk(disk);
+ if (q)
+ blk_cleanup_queue(q);
+ put_disk(disk);
+ sc->disk = NULL;
+
+ devfs_remove("ub");
+ unregister_blkdev(UB_MAJOR, sc->name);
+ // blk_cleanup_queue(sc->oob_q);
+
+ /*
+ * At this point there must be no commands coming from anyone
+ * and no URBs left in transit.
+ */
+
+ usb_set_intfdata(intf, NULL);
+ usb_put_dev(sc->dev);
+ kfree(sc);
+}
+
+struct usb_driver ub_driver = {
+ .owner = THIS_MODULE,
+ .name = "ub",
+ .probe = ub_probe,
+ .disconnect = ub_disconnect,
+ .id_table = ub_usb_ids,
+};
+
+static int __init ub_init(void)
+{
+
+ /* P3 */ printk("ub: sizeof ub_scsi_cmd %u ub_dev %u\n",
+ sizeof(struct ub_scsi_cmd), sizeof(struct ub_dev));
+ return usb_register(&ub_driver);
+}
+
+static void __exit ub_exit(void)
+{
+ usb_deregister(&ub_driver);
+}
+
+module_init(ub_init);
+module_exit(ub_exit);
+
+MODULE_LICENSE("GPL");
diff -urpN -X dontdiff linux-2.6.7/drivers/usb/storage/unusual_devs.h linux-2.6.7-ub/drivers/usb/storage/unusual_devs.h
--- linux-2.6.7/drivers/usb/storage/unusual_devs.h 2004-06-16 16:53:59.000000000 -0700
+++ linux-2.6.7-ub/drivers/usb/storage/unusual_devs.h 2004-06-16 17:47:14.000000000 -0700
@@ -486,11 +486,13 @@ UNUSUAL_DEV( 0x0781, 0x0001, 0x0200, 0x
US_SC_SCSI, US_PR_CB, NULL,
US_FL_SINGLE_LUN ),

+#if 0 /* passed over to ub */
UNUSUAL_DEV( 0x0781, 0x0002, 0x0009, 0x0009,
"Sandisk",
"ImageMate SDDR-31",
US_SC_DEVICE, US_PR_DEVICE, NULL,
US_FL_IGNORE_SER ),
+#endif

UNUSUAL_DEV( 0x0781, 0x0100, 0x0100, 0x0100,
"Sandisk",
diff -urpN -X dontdiff linux-2.6.7/drivers/usb/storage/usb.c linux-2.6.7-ub/drivers/usb/storage/usb.c
--- linux-2.6.7/drivers/usb/storage/usb.c 2004-06-16 16:53:59.000000000 -0700
+++ linux-2.6.7-ub/drivers/usb/storage/usb.c 2004-06-16 17:47:14.000000000 -0700
@@ -133,7 +133,6 @@ static struct usb_device_id storage_usb_
{ USB_INTERFACE_INFO(USB_CLASS_MASS_STORAGE, US_SC_QIC, US_PR_BULK) },
{ USB_INTERFACE_INFO(USB_CLASS_MASS_STORAGE, US_SC_UFI, US_PR_BULK) },
{ USB_INTERFACE_INFO(USB_CLASS_MASS_STORAGE, US_SC_8070, US_PR_BULK) },
- { USB_INTERFACE_INFO(USB_CLASS_MASS_STORAGE, US_SC_SCSI, US_PR_BULK) },

/* Terminating entry */
{ }
diff -urpN -X dontdiff linux-2.6.7/include/linux/major.h linux-2.6.7-ub/include/linux/major.h
--- linux-2.6.7/include/linux/major.h 2004-04-21 12:01:24.000000000 -0700
+++ linux-2.6.7-ub/include/linux/major.h 2004-06-16 17:47:14.000000000 -0700
@@ -165,4 +165,6 @@

#define VIOTAPE_MAJOR 230

+#define UB_MAJOR 125 /* USB Block - Experimental XXX */
+
#endif


2004-06-26 20:13:37

by Matthew Dharm

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sat, Jun 26, 2004 at 01:06:45PM -0700, Pete Zaitcev wrote:
> Hi, guys,
>
> I have drafted up an implementation of a USB storage driver as I wish
> it done (called "ub"). The main goal of the project is to produce a driver
> which never oopses, and above all, never locks up the machine. To this
> point I did all my debugging while running X11 and yapping on IRC. If this
> goal requires to sacrifice performance, so be it. This is how ub differs
> from the usb-storage.
>
> The current usb-storage works quite well on servers where netdump can
> be brought to bear, but on desktop its debuggability leaves some room
> for improvement. In all other respects, it is superior to ub. Since
> characteristics of usb-storage and ub are different I expect them to
> coexist potentially indefinitely (if ub finds any use at all).

Would I be correct in the following assessments:
(1) UB only supports direct-access type devices
(2) UB only supports 'transparent scsi' devices
(3) UB only supports 'bulk-only transport' devices

Matt

--
Matthew Dharm Home: [email protected]
Maintainer, Linux USB Mass Storage Driver

It was a new hope.
-- Dust Puppy
User Friendly, 12/25/1998


Attachments:
(No filename) (1.19 kB)
(No filename) (189.00 B)
Download all attachments

2004-06-26 20:34:31

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c


> +/* command block wrapper */
> +struct bulk_cb_wrap {
> + u32 Signature; /* contains 'USBC' */
> + u32 Tag; /* unique per command id */
> + u32 DataTransferLength; /* size of data */
> + u8 Flags; /* direction in bit 0 */
> + u8 Lun; /* LUN normally 0 */
> + u8 Length; /* of of the CDB */
> + u8 CDB[MAX_CDB_SIZE]; /* max command */
> +};

should be packed attributed

> +#define US_BULK_CB_WRAP_LEN 31
> +#define US_BULK_CB_SIGN 0x43425355 /*spells out USBC */
> +#define US_BULK_FLAG_IN 1
> +#define US_BULK_FLAG_OUT 0
> +
> +/* command status wrapper */
> +struct bulk_cs_wrap {
> + u32 Signature; /* should = 'USBS' */
> + u32 Tag; /* same as original command */
> + u32 Residue; /* amount not transferred */
> + u8 Status; /* see below */
> +};

should be packed attributed

> +#define US_BULK_CS_WRAP_LEN 13
> +#define US_BULK_CS_SIGN 0x53425355 /* spells out 'USBS' */
> +/* This is for Olympus Camedia digital cameras */
> +#define US_BULK_CS_OLYMPUS_SIGN 0x55425355 /* spells out 'USBU' */
> +#define US_BULK_STAT_OK 0
> +#define US_BULK_STAT_FAIL 1
> +#define US_BULK_STAT_PHASE 2
> +
> +/* bulk-only class specific requests */
> +#define US_BULK_RESET_REQUEST 0xff
> +#define US_BULK_GET_MAX_LUN 0xfe
> +
> +/*
> + */
> +struct ub_dev;
> +
> +#define UB_MAX_REQ_SG 1
> +
> +/*
> + * An instance of a SCSI command in transit.
> + */
> +#define UB_DIR_NONE 0
> +#define UB_DIR_READ 1
> +#define UB_DIR_ILLEGAL2 2
> +#define UB_DIR_WRITE 3
> +
> +enum ub_scsi_cmd_state {
> + UB_CMDST_INIT, /* Initial state */
> + UB_CMDST_CMD, /* Command submitted */
> + UB_CMDST_DATA, /* Data phase */
> + UB_CMDST_CLR2STS, /* Clearing before requesting status */
> + UB_CMDST_STAT, /* Status phase */
> + UB_CMDST_CLEAR, /* Clearing a stall (halt, actually) */
> + UB_CMDST_SENSE, /* Sending Request Sense */
> + UB_CMDST_DONE /* Final state */
> +};
> +
> +struct ub_scsi_cmd {
> + unsigned char cdb[MAX_CDB_SIZE];
> + unsigned char cdb_len;
> +
> + unsigned char dir; /* 0 - none, 1 - read, 3 - write. */
> + enum ub_scsi_cmd_state state;
> + unsigned int tag;
> +
> + int error; /* Return code - valid upon done */
> + int act_len; /* Return size */
> +
> + int stat_count; /* Retries getting status. */
> +
> + /*
> + * We do not support transfers from highmem pages
> + * because the underlying USB framework does not.
> + *
> + * XXX Actually, there is a (very fresh and buggy) support
> + * for transfers under control of struct scatterlist, usb_map_sg()
> + * in 2.6.6, but it seems to have issues with highmem.
> + */
> + char *data; /* Requested buffer */
> + unsigned int len; /* Requested length */
> + // struct scatterlist sgv[UB_MAX_REQ_SG];
> +
> + void (*done)(struct ub_dev *, struct ub_scsi_cmd *);
> + void *back;
> +};
> +
> +/*
> + */
> +struct ub_capacity {
> + unsigned long nsec; /* Linux size - 512 byte sectors */
> + unsigned int bsize; /* Linux hardsect_size */
> + unsigned int bshift; /* Shift between 512 and hard sects */
> +};
> +
> +/*
> + * The SCSI command tracing structure.
> + */
> +
> +#define SCMD_ST_HIST_SZ 8
> +#define SCMD_TRACE_SZ 5
> +
> +struct ub_scsi_cmd_trace {
> + unsigned int tag;
> + unsigned char op;
> + unsigned char dir;
> + unsigned int req_size, act_size;
> + int hcur;
> + enum ub_scsi_cmd_state st_hst[SCMD_ST_HIST_SZ]; /* u8? */
> +};
> +
> +struct ub_scsi_trace {
> + int cur;
> + struct ub_scsi_cmd_trace vec[SCMD_TRACE_SZ];
> +};
> +
> +/*
> + * The UB device instance.
> + */
> +struct ub_dev {
> + spinlock_t lock;
> + unsigned int id; /* Number among ub's */
> +
> + unsigned int tagcnt;
> + int poison;
> + char name[8];
> + struct usb_device *dev;
> + struct usb_interface *intf;
> +
> + struct ub_capacity capacity;
> + struct gendisk *disk;
> +
> + unsigned int send_bulk_pipe; /* cached pipe values */
> + unsigned int recv_bulk_pipe;
> + unsigned int send_ctrl_pipe;
> + unsigned int recv_ctrl_pipe;
> +
> + struct tasklet_struct urb_tasklet;
> +
> + /* XXX Use Ingo's mempool (once we have more than one) */
> + int cmda[1];
> + struct ub_scsi_cmd cmdv[1];
> +
> + int busy;
> + struct ub_scsi_cmd *top_cmd; /* XXX Under ->busy until we have a queue */
> + struct ub_scsi_cmd top_rqs_cmd; /* REQUEST SENSE */
> + unsigned char top_sense[SENSE_SIZE];
> +
> + struct urb work_urb;
> + int last_pipe; /* What might need clearing */
> + struct bulk_cb_wrap work_bcb;
> + struct bulk_cs_wrap work_bcs;
> + struct usb_ctrlrequest work_cr;
> +
> + struct ub_scsi_trace tr;
> +};
> +
> +/*
> + */
> +static void ub_rw_cmd_done(struct ub_dev *sc, struct ub_scsi_cmd *cmd);
> +static void ub_end_rq(struct request *rq, int uptodate);
> +static int ub_submit_scsi(struct ub_dev *sc, struct ub_scsi_cmd *cmd);
> +static void ub_scsi_urb_complete(struct urb *urb, struct pt_regs *pt);
> +static void ub_scsi_urb_action(unsigned long _dev);
> +static void ub_scsi_urb_compl(struct ub_dev *sc, struct ub_scsi_cmd *cmd);
> +static void ub_state_stat(struct ub_dev *sc, struct ub_scsi_cmd *cmd);
> +static int ub_submit_clear_stall(struct ub_dev *sc, struct ub_scsi_cmd *cmd,
> + int stalled_pipe);
> +static int ub_submit_top_sense(struct ub_dev *sc, struct ub_scsi_cmd *cmd);
> +
> +/*
> + */
> +static struct usb_device_id ub_usb_ids[] = {
> + { USB_DEVICE_VER(0x0781, 0x0002, 0x0009, 0x0009) }, /* SDDR-31 */
> + { }
> +};
> +
> +MODULE_DEVICE_TABLE(usb, ub_usb_ids);
> +
> +static unsigned int ub_host_id;
> +
> +/*
> + * The SCSI command tracing procedures.
> + */
> +
> +static void ub_cmdtr_new(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
> +{
> + int n;
> + struct ub_scsi_cmd_trace *t;
> +
> + if ((n = sc->tr.cur + 1) == SCMD_TRACE_SZ) n = 0;
> + t = &sc->tr.vec[n];
> +
> + memset(t, 0, sizeof(struct ub_scsi_cmd_trace));
> + t->tag = cmd->tag;
> + t->op = cmd->cdb[0];
> + t->dir = cmd->dir;
> + t->req_size = cmd->len;
> + t->st_hst[0] = cmd->state;
> +
> + sc->tr.cur = n;
> +}
> +
> +static void ub_cmdtr_state(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
> +{
> + int n;
> + struct ub_scsi_cmd_trace *t;
> +
> + t = &sc->tr.vec[sc->tr.cur];
> + if ((n = t->hcur + 1) == SCMD_ST_HIST_SZ) n = 0;
> + t->st_hst[n] = cmd->state;
> + t->hcur = n;
> +}
> +
> +static void ub_cmdtr_act_len(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
> +{
> + struct ub_scsi_cmd_trace *t;
> +
> + t = &sc->tr.vec[sc->tr.cur];
> + t->act_size = cmd->act_len;
> +}
> +
> +/* The struct disk_attribute must match drivers/block/genhd.c */
> +struct disk_attribute {
> + struct attribute attr;
> + ssize_t (*show)(struct gendisk *, char *);
> +};
> +
> +static ssize_t ub_sysdiag_show(struct gendisk *disk, char *page)
> +{
> + struct ub_dev *sc = disk->private_data;
> + int cnt = 0;
> + unsigned long flags;
> + int nc, nh;
> + int i, j;
> + struct ub_scsi_cmd_trace *t;
> +
> + spin_lock_irqsave(&sc->lock, flags);
> + if ((nc = sc->tr.cur + 1) == SCMD_TRACE_SZ) nc = 0;
> + for (j = 0; j < SCMD_TRACE_SZ; j++) {
> + t = &sc->tr.vec[nc];
> +
> + cnt += sprintf(page + cnt, "%08x %02x", t->tag, t->op);
> + cnt += sprintf(page + cnt, " %d", t->dir);
> + cnt += sprintf(page + cnt, " [%5d %5d]", t->req_size, t->act_size);
> + if ((nh = t->hcur + 1) == SCMD_ST_HIST_SZ) nh = 0;
> + for (i = 0; i < SCMD_ST_HIST_SZ; i++) {
> + cnt += sprintf(page + cnt, " %d", t->st_hst[nh]);
> + if (++nh == SCMD_ST_HIST_SZ) nh = 0;
> + }
> + cnt += sprintf(page + cnt, "\n");
> +
> + if (++nc == SCMD_TRACE_SZ) nc = 0;
> + }
> + spin_unlock_irqrestore(&sc->lock, flags);
> + return cnt;
> +}
> +
> +static struct disk_attribute ub_sysdiag_attr = {
> + .attr = {.name = "diag", .mode = S_IRUGO },
> + .show = ub_sysdiag_show
> +};
> +
> +static int ub_sysdiag_create(struct ub_dev *sc)
> +{
> + return sysfs_create_file(&sc->disk->kobj, &ub_sysdiag_attr.attr);
> +}
> +
> +/*
> + * The "command allocator".
> + */
> +static struct ub_scsi_cmd *ub_get_cmd(struct ub_dev *sc)
> +{
> + struct ub_scsi_cmd *ret;
> +
> + if (sc->cmda[0])
> + return NULL;
> + ret = &sc->cmdv[0];
> + sc->cmda[0] = 1;
> + return ret;
> +}
> +
> +static void ub_put_cmd(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
> +{
> + if (cmd != &sc->cmdv[0]) {
> + printk(KERN_WARNING DRV_NAME ": "
> + "releasing a foreign cmd %p\n", cmd);
> + return;
> + }
> + if (!sc->cmda[0]) {
> + printk(KERN_WARNING DRV_NAME ": "
> + "releasing a free cmd\n");
> + return;
> + }
> + sc->cmda[0] = 0;
> +}
> +
> +/*
> + * The request function is our main entry point
> + */
> +
> +static inline int ub_bd_rq_fn_1(request_queue_t *q)
> +{
> +#if 0
> + int writing = 0, pci_dir, i, n_elem;
> + u32 tmp;
> + unsigned int msg_size;
> +#endif
> + struct ub_dev *sc = q->queuedata;
> + struct request *rq;
> +#if 0 /* We use rq->buffer for now */
> + struct scatterlist *sg;
> + int n_elem;
> +#endif
> + struct ub_scsi_cmd *cmd;
> + int ub_dir;
> + unsigned int block, nblks;
> + int rc;
> +
> + if ((rq = elv_next_request(q)) == NULL)
> + return 1;
> +
> + if ((cmd = ub_get_cmd(sc)) == NULL) {
> + blk_stop_queue(q);
> + return 1;
> + }
> +
> + blkdev_dequeue_request(rq);
> +
> + if (rq_data_dir(rq) == WRITE)
> + ub_dir = UB_DIR_WRITE;
> + else
> + ub_dir = UB_DIR_READ;
> +
> + /*
> + * get scatterlist from block layer
> + */
> +#if 0 /* We use rq->buffer for now */
> + sg = &cmd->sgv[0];
> + n_elem = blk_rq_map_sg(q, rq, sg);
> + if (n_elem <= 0) {
> + ub_put_cmd(sc, cmd);
> + ub_end_rq(rq, 0);
> + blk_start_queue(q);
> + return 0; /* request with no s/g entries? */
> + }
> +
> + if (n_elem != 1) { /* Paranoia */
> + printk(KERN_WARNING DRV_NAME ": request with %d segments\n",
> + n_elem);
> + ub_put_cmd(sc, cmd);
> + ub_end_rq(rq, 0);
> + blk_start_queue(q);
> + return 0;
> + }
> +#endif
> + /*
> + * XXX Unfortunately, this check does not work. It is quite possible
> + * to get bogus non-null rq->buffer if you allow sg by mistake.
> + */
> + if (rq->buffer == NULL) {
> + /*
> + * This must not happen if we set the queue right.
> + * The block level must create bounce buffers for us.
> + */
> + static int do_print = 1;
> + if (do_print) {
> + printk(KERN_WARNING DRV_NAME ": unmapped request\n");
> + do_print = 0;
> + }
> + ub_put_cmd(sc, cmd);
> + ub_end_rq(rq, 0);
> + blk_start_queue(q);
> + return 0;
> + }
> +
> +#if 0 /* Just to show what carmel does in this place... We won't need this. */
> + /* map scatterlist to PCI bus addresses */
> + n_elem = pci_map_sg(host->pdev, sg, n_elem, pci_dir);
> + if (n_elem <= 0) {
> + carm_end_rq(host, crq, 0);
> + return 1; /* request with no s/g entries? */
> + }
> + crq->n_elem = n_elem;
> + crq->port = port;
> + host->hw_sg_used += n_elem;
> +#endif
> +
> + /*
> + * build the command
> + */
> + block = rq->sector;
> + nblks = rq->nr_sectors;
> +
> + memset(cmd, 0, sizeof(struct ub_scsi_cmd));
> + cmd->cdb[0] = (ub_dir == UB_DIR_READ)? READ_10: WRITE_10;
> + /* 10-byte uses 4 bytes of LBA: 2147483648KB, 2097152MB, 2048GB */
> + cmd->cdb[2] = block >> 24;
> + cmd->cdb[3] = block >> 16;
> + cmd->cdb[4] = block >> 8;
> + cmd->cdb[5] = block;

we have macros for that.

> + cmd->cdb[7] = nblks >> 8;
> + cmd->cdb[8] = nblks;

dito

Regards
Oliver

2004-06-26 21:42:42

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sat, 26 Jun 2004 22:35:15 +0200
Oliver Neukum <[email protected]> wrote:

>
> > +/* command block wrapper */
> > +struct bulk_cb_wrap {
> > + u32 Signature; /* contains 'USBC' */
> > + u32 Tag; /* unique per command id */
> > + u32 DataTransferLength; /* size of data */
> > + u8 Flags; /* direction in bit 0 */
> > + u8 Lun; /* LUN normally 0 */
> > + u8 Length; /* of of the CDB */
> > + u8 CDB[MAX_CDB_SIZE]; /* max command */
> > +};
>
> should be packed attributed

Only when necessary, and I do not belive any platform Linux supports
needs it in this case.

Packed attributing is _BAD_ when it isn't needed because it forces GCC
to output multiple byte-sized loads in order to access larger than
byte-sized elements because it cannot assume the proper alignment on RISC
systems.

2004-06-26 21:56:04

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Samstag, 26. Juni 2004 23:41 schrieb David S. Miller:
> On Sat, 26 Jun 2004 22:35:15 +0200
> Oliver Neukum <[email protected]> wrote:
>
> >
> > > +/* command block wrapper */
> > > +struct bulk_cb_wrap {
> > > + u32 Signature; /* contains 'USBC' */
> > > + u32 Tag; /* unique per command id */
> > > + u32 DataTransferLength; /* size of data */
> > > + u8 Flags; /* direction in bit 0 */
> > > + u8 Lun; /* LUN normally 0 */
> > > + u8 Length; /* of of the CDB */
> > > + u8 CDB[MAX_CDB_SIZE]; /* max command */
> > > +};
> >
> > should be packed attributed
>
> Only when necessary, and I do not belive any platform Linux supports
> needs it in this case.
>
> Packed attributing is _BAD_ when it isn't needed because it forces GCC
> to output multiple byte-sized loads in order to access larger than
> byte-sized elements because it cannot assume the proper alignment on RISC
> systems.

Unless I am mistaken, this structure is transfered as such over the bus,
so IMHO here it is needed.

Regards
Oliver

2004-06-26 22:07:59

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sat, 26 Jun 2004 23:56:49 +0200
Oliver Neukum <[email protected]> wrote:

> Unless I am mistaken, this structure is transfered as such over the bus,
> so IMHO here it is needed.

That is not the only criterious that needs to be met in order for
the packed attribute to be required.

It is needed only if the structure elements are such that gcc will
not packet them properly on all supported architecutures. Peter's
example in his code will be packed properly without the packed
attribute to the best of my knowledge.

2004-06-26 22:35:19

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Sonntag, 27. Juni 2004 00:07 schrieb David S. Miller:
> On Sat, 26 Jun 2004 23:56:49 +0200
> Oliver Neukum <[email protected]> wrote:
>
> > Unless I am mistaken, this structure is transfered as such over the bus,
> > so IMHO here it is needed.
>
> That is not the only criterious that needs to be met in order for
> the packed attribute to be required.
>
> It is needed only if the structure elements are such that gcc will
> not packet them properly on all supported architecutures. Peter's
> example in his code will be packed properly without the packed
> attribute to the best of my knowledge.

So either it has no effect or it is needed?
Then why take the risk that gcc is changed or an architecture added that
needs it? It seems to be cleaner to me to mark data structures that
must be layed out as specified specially. Safer, too, just in case.

Regards
Oliver

2004-06-26 22:45:53

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Samstag, 26. Juni 2004 22:06 schrieb Pete Zaitcev:
> +static int ub_submit_top_sense(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
> +{
> +???????struct ub_scsi_cmd *scmd;
> +
> +???????scmd = &sc->top_rqs_cmd;
> +
> +???????/* XXX Can be done at init */
> +???????scmd->cdb[0] = REQUEST_SENSE;
> +???????scmd->cdb_len = 6;
> +???????scmd->dir = UB_DIR_READ;
> +???????scmd->state = UB_CMDST_INIT;
> +???????scmd->data = sc->top_sense;

You must allocate a separate buffer to the sense data.
We had similar code in hid which leads to data corruption
on some architectures. It's an issue of DMA coherency.

> +???????scmd->len = SENSE_SIZE;
> +???????scmd->done = ub_top_sense_done;
> +???????scmd->back = cmd;
> +
> +???????scmd->tag = sc->tagcnt++;
> +???????return ub_submit_scsi(sc, scmd);
> +}

Regards
Oliver

2004-06-26 22:54:41

by Andries Brouwer

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sat, Jun 26, 2004 at 10:35:15PM +0200, Oliver Neukum wrote:

> > + cmd->cdb[2] = block >> 24;
> > + cmd->cdb[3] = block >> 16;
> > + cmd->cdb[4] = block >> 8;
> > + cmd->cdb[5] = block;
>
> we have macros for that.
>
> > + cmd->cdb[7] = nblks >> 8;
> > + cmd->cdb[8] = nblks;
>
> dito

Yes, we have macros. Using those macros would not at all be an improvement here.

Andries

2004-06-26 22:58:05

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Sonntag, 27. Juni 2004 00:54 schrieb Andries Brouwer:
> On Sat, Jun 26, 2004 at 10:35:15PM +0200, Oliver Neukum wrote:
>
> > > + cmd->cdb[2] = block >> 24;
> > > + cmd->cdb[3] = block >> 16;
> > > + cmd->cdb[4] = block >> 8;
> > > + cmd->cdb[5] = block;
> >
> > we have macros for that.
> >
> > > + cmd->cdb[7] = nblks >> 8;
> > > + cmd->cdb[8] = nblks;
> >
> > dito
>
> Yes, we have macros. Using those macros would not at all be an improvement here.

How do you arrive at that unusual conclusion?

Regards
Oliver

2004-06-26 23:08:08

by Andries Brouwer

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, Jun 27, 2004 at 12:59:04AM +0200, Oliver Neukum wrote:

>>>>+ cmd->cdb[2] = block >> 24;
>>>>+ cmd->cdb[3] = block >> 16;
>>>>+ cmd->cdb[4] = block >> 8;
>>>>+ cmd->cdb[5] = block;
>>>
>>> we have macros for that.
>>>
>>>>+ cmd->cdb[7] = nblks >> 8;
>>>>+ cmd->cdb[8] = nblks;
>>>
>>> dito
>>
>> Yes, we have macros. Using those macros would not at all be an improvement here.
>
> How do you arrive at that unusual conclusion?

The above writes clearly and simply what one wants.
I expect that you propose writing

*((u32 *)(cmd->cdb + 2)) = cpu_to_be32(block);

or some similar unspeakable ugliness.
If you had something else in mind, please reveal what.

2004-06-26 23:21:19

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004 00:36:14 +0200
Oliver Neukum <[email protected]> wrote:

> So either it has no effect or it is needed?
> Then why take the risk that gcc is changed or an architecture added that
> needs it? It seems to be cleaner to me to mark data structures that
> must be layed out as specified specially. Safer, too, just in case.

Because it generates enormously inefficient code on RISC
platforms. It turns simple word loads like:

ld [%addr], %reg

into something like:

ldub [%addr + 0], %tmp1
ldub [%addr + 1], %tmp2
ldub [%addr + 2], %tmp3
ldub [%addr + 3], %tmp4
sll %tmp1, 24, %tmp1
sll %tmp2, 16, %tmp2
sll %tmp3, 8, %tmp3
or %tmp1, %tmp2, %tmp1
or %tmp1, %tmp3, %tmp1
or %tmp1, %tmp4, %reg

A ten-fold increase in code size just to access any member
of the structure.

I think you have no idea how astronomically inefficient the code is
which gets generated when you add the packed attribute to a structure.

2004-06-27 02:08:38

by Pete Zaitcev

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sat, 26 Jun 2004 13:12:25 -0700
Matthew Dharm <[email protected]> wrote:

> Would I be correct in the following assessments:
> (1) UB only supports direct-access type devices
> (2) UB only supports 'transparent scsi' devices
> (3) UB only supports 'bulk-only transport' devices

Yes, you would. Someone mentioned on usb-storage list that Windows supports
two things only without extra drivers: Transparent SCSI over Bulk, and UFI.
IIRC, it was Pat. So, I thought maybe to tackle drivers/block/ufi.c later,
to share design concept and some libraries with ub. The rest stays with
usb-storage. But the real life always introduces corrections to plans, so
let's not get too far ahead.

-- Pete

2004-06-27 03:30:49

by Matthew Dharm

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sat, Jun 26, 2004 at 07:08:27PM -0700, Pete Zaitcev wrote:
> On Sat, 26 Jun 2004 13:12:25 -0700
> Matthew Dharm <[email protected]> wrote:
>
> > Would I be correct in the following assessments:
> > (1) UB only supports direct-access type devices
> > (2) UB only supports 'transparent scsi' devices
> > (3) UB only supports 'bulk-only transport' devices
>
> Yes, you would. Someone mentioned on usb-storage list that Windows supports
> two things only without extra drivers: Transparent SCSI over Bulk, and UFI.
> IIRC, it was Pat. So, I thought maybe to tackle drivers/block/ufi.c later,
> to share design concept and some libraries with ub. The rest stays with
> usb-storage. But the real life always introduces corrections to plans, so
> let's not get too far ahead.

Do you have a plan for non "direct-access" devices?

CD-ROMs are common, but tapes exist also.

Matt

--
Matthew Dharm Home: [email protected]
Maintainer, Linux USB Mass Storage Driver

Ye gods! I have feet??!
-- Dust Puppy
User Friendly, 12/4/1997


Attachments:
(No filename) (1.06 kB)
(No filename) (189.00 B)
Download all attachments

2004-06-27 03:52:22

by Alan Stern

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004, Oliver Neukum wrote:

> Am Samstag, 26. Juni 2004 22:06 schrieb Pete Zaitcev:
> > +static int ub_submit_top_sense(struct ub_dev *sc, struct ub_scsi_cmd *cmd)
> > +{
> > +???????struct ub_scsi_cmd *scmd;
> > +
> > +???????scmd = &sc->top_rqs_cmd;
> > +
> > +???????/* XXX Can be done at init */
> > +???????scmd->cdb[0] = REQUEST_SENSE;
> > +???????scmd->cdb_len = 6;
> > +???????scmd->dir = UB_DIR_READ;
> > +???????scmd->state = UB_CMDST_INIT;
> > +???????scmd->data = sc->top_sense;
>
> You must allocate a separate buffer to the sense data.
> We had similar code in hid which leads to data corruption
> on some architectures. It's an issue of DMA coherency.

I mentioned this some time ago to the SCSI maintainers. They felt that it
wasn't necessary to allocate a separate buffer for the sense data -- I'm
not sure why. Apparently most if not all SCSI drivers fail to do this.
Usb-storage doesn't do it either; maybe we should.

Alan Stern

2004-06-27 04:05:30

by Alan Stern

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sat, 26 Jun 2004, Pete Zaitcev wrote:

> Hi, guys,
>
> I have drafted up an implementation of a USB storage driver as I wish
> it done (called "ub"). The main goal of the project is to produce a driver
> which never oopses, and above all, never locks up the machine. To this
> point I did all my debugging while running X11 and yapping on IRC. If this
> goal requires to sacrifice performance, so be it. This is how ub differs
> from the usb-storage.

I would say the major difference is that ub implements a block device
interface directly whereas usb-storage relies on the SCSI layer for that
purpose. (Not to mention that usb-storage works with devices that aren't
of the direct-access type...)

> The current usb-storage works quite well on servers where netdump can
> be brought to bear, but on desktop its debuggability leaves some room
> for improvement.

I agree that the debug logging in usb-storage is not good. A worthwhile
improvement would be to log only commands that fail or get an error, with
the logging selected by the normal USB debugging (not usb-storage verbose
debugging) configuration option. Matt, what do you think?

> In all other respects, it is superior to ub. Since
> characteristics of usb-storage and ub are different I expect them to
> coexist potentially indefinitely (if ub finds any use at all).

It look like you are targeting ub for Linux 2.4. Do you intend to use it
with 2.6? An important difference between the two kernel versions is that
in 2.6 we do not try to make devices persistent across disconnections by
recognizing some type of unique ID.

Alan Stern

2004-06-27 04:30:43

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Sonntag, 27. Juni 2004 01:20 schrieb David S. Miller:
> A ten-fold increase in code size just to access any member
> of the structure.
>
> I think you have no idea how astronomically inefficient the code is
> which gets generated when you add the packed attribute to a structure.

Are you saying that gcc will generate other code with packed even if
packed does not change the layout of the structure in question?

Regards
Oliver

2004-06-27 05:03:39

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Sonntag, 27. Juni 2004 01:08 schrieb Andries Brouwer:
> >> Yes, we have macros. Using those macros would not at all be an improvement here.
> >
> > How do you arrive at that unusual conclusion?
>
> The above writes clearly and simply what one wants.
> I expect that you propose writing
>
> ????????*((u32 *)(cmd->cdb + 2)) = cpu_to_be32(block);
>
> or some similar unspeakable ugliness.
> If you had something else in mind, please reveal what.

That "ugliness" has the unspeakable advantage of producing sane code
on big endian architectures.
If you want eye candy, then go and use a structured data type rather
than an array of bytes.

Regards
Oliver

2004-06-27 05:04:59

by Greg KH

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, Jun 27, 2004 at 12:05:22AM -0400, Alan Stern wrote:
> It look like you are targeting ub for Linux 2.4. Do you intend to use it
> with 2.6? An important difference between the two kernel versions is that
> in 2.6 we do not try to make devices persistent across disconnections by
> recognizing some type of unique ID.

The patch was against 2.6.7. Why do you think this is for 2.4?

greg k-h

2004-06-27 05:36:06

by Matthew Dharm

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, Jun 27, 2004 at 12:05:22AM -0400, Alan Stern wrote:
> On Sat, 26 Jun 2004, Pete Zaitcev wrote:
> > The current usb-storage works quite well on servers where netdump can
> > be brought to bear, but on desktop its debuggability leaves some room
> > for improvement.
>
> I agree that the debug logging in usb-storage is not good. A worthwhile
> improvement would be to log only commands that fail or get an error, with
> the logging selected by the normal USB debugging (not usb-storage verbose
> debugging) configuration option. Matt, what do you think?

This is an acknowledged weak point of usb-storage.

A 'log only on error' system might be good... but it's going to be a bit
tricky for two reasons:

1) We have to keep data around longer than currently, so we can log it if
something goes wrong later (so we see how we got to this point).

2) What counts as an error? We probably only want this to kick in when the
SCSI error-recovery does. Normal 'failed commands' probably shouldn't
count.

Matt

--
Matthew Dharm Home: [email protected]
Maintainer, Linux USB Mass Storage Driver

I want my GPFs!!!
-- Stef
User Friendly, 11/9/1998


Attachments:
(No filename) (1.17 kB)
(No filename) (189.00 B)
Download all attachments

2004-06-27 06:35:40

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004 06:31:40 +0200
Oliver Neukum <[email protected]> wrote:

> Am Sonntag, 27. Juni 2004 01:20 schrieb David S. Miller:
> > A ten-fold increase in code size just to access any member
> > of the structure.
> >
> > I think you have no idea how astronomically inefficient the code is
> > which gets generated when you add the packed attribute to a structure.
>
> Are you saying that gcc will generate other code with packed even if
> packed does not change the layout of the structure in question?

That's correct, because the packed attribute also means that the alignment
of any particular instance of the structure is not guanrenteed.

2004-06-27 10:42:11

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Sonntag, 27. Juni 2004 08:34 schrieb David S. Miller:
> On Sun, 27 Jun 2004 06:31:40 +0200
> Oliver Neukum <[email protected]> wrote:
>
> > Am Sonntag, 27. Juni 2004 01:20 schrieb David S. Miller:
> > > A ten-fold increase in code size just to access any member
> > > of the structure.
> > >
> > > I think you have no idea how astronomically inefficient the code is
> > > which gets generated when you add the packed attribute to a structure.
> >
> > Are you saying that gcc will generate other code with packed even if
> > packed does not change the layout of the structure in question?
>
> That's correct, because the packed attribute also means that the alignment
> of any particular instance of the structure is not guanrenteed.

OK, then it shouldn't be used in this case. However, shouldn't we have
an attribute like __nopadding__ which does exactly that?

Regards
Oliver

2004-06-27 14:08:53

by Andries Brouwer

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, Jun 27, 2004 at 07:04:36AM +0200, Oliver Neukum wrote:
> > >> Yes, we have macros. Using those macros would not at all be an improvement here.
> > >
> > > How do you arrive at that unusual conclusion?
> >
> > The above writes clearly and simply what one wants.
> > I expect that you propose writing
> >
> > ????????*((u32 *)(cmd->cdb + 2)) = cpu_to_be32(block);
> >
> > or some similar unspeakable ugliness.
> > If you had something else in mind, please reveal what.
>
> That "ugliness" has the unspeakable advantage of producing sane code
> on big endian architectures.

I am not so sure. It tells the compiler to do a 4-byte access
on an address that possibly is not 4-byte aligned.

2004-06-27 14:23:17

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Sonntag, 27. Juni 2004 16:08 schrieb Andries Brouwer:
> On Sun, Jun 27, 2004 at 07:04:36AM +0200, Oliver Neukum wrote:
> > > >> Yes, we have macros. Using those macros would not at all be an improvement here.
> > > >
> > > > How do you arrive at that unusual conclusion?
> > >
> > > The above writes clearly and simply what one wants.
> > > I expect that you propose writing
> > >
> > > ????????*((u32 *)(cmd->cdb + 2)) = cpu_to_be32(block);
> > >
> > > or some similar unspeakable ugliness.
> > > If you had something else in mind, please reveal what.
> >
> > That "ugliness" has the unspeakable advantage of producing sane code
> > on big endian architectures.
>
> I am not so sure. It tells the compiler to do a 4-byte access
> on an address that possibly is not 4-byte aligned.

We also have the unaligned family of macro. Probably the cleanest
solution would be a union to do away with the ugly casts that would
be needed.

Regards
Oliver

2004-06-27 15:19:42

by Alan Stern

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004, Oliver Neukum wrote:

> Am Sonntag, 27. Juni 2004 16:08 schrieb Andries Brouwer:
> > On Sun, Jun 27, 2004 at 07:04:36AM +0200, Oliver Neukum wrote:
> > > > >> Yes, we have macros. Using those macros would not at all be an improvement here.
> > > > >
> > > > > How do you arrive at that unusual conclusion?
> > > >
> > > > The above writes clearly and simply what one wants.
> > > > I expect that you propose writing
> > > >
> > > > ????????*((u32 *)(cmd->cdb + 2)) = cpu_to_be32(block);
> > > >
> > > > or some similar unspeakable ugliness.
> > > > If you had something else in mind, please reveal what.
> > >
> > > That "ugliness" has the unspeakable advantage of producing sane code
> > > on big endian architectures.
> >
> > I am not so sure. It tells the compiler to do a 4-byte access
> > on an address that possibly is not 4-byte aligned.
>
> We also have the unaligned family of macro. Probably the cleanest
> solution would be a union to do away with the ugly casts that would
> be needed.

My favorite approach has always been:

put_be32(cmd->cdb + 2, block);

Unfortunately there is no such function or macro! It's easy to define an
inline function that would carry out the series of four single-byte
assignments that originally started this discussion. A more sophisticated
implementation would expand to Andries' unspeakably ugly code on
big-endian platforms that don't impose a large penalty for non-aligned
4-byte accesses. I leave it up to others to decide which is best on
little-endian platforms that can do unaligned accesses.

I think it would be great if some such utility routine were added to a
standard header in the kernel, together with its siblings put_le32(),
put_be16(), put_le16(), and the corresponding get_xxxx() functions.

Alan Stern

2004-06-27 15:23:13

by Alan Stern

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sat, 26 Jun 2004, Greg KH wrote:

> On Sun, Jun 27, 2004 at 12:05:22AM -0400, Alan Stern wrote:
> > It look like you are targeting ub for Linux 2.4. Do you intend to use it
> > with 2.6? An important difference between the two kernel versions is that
> > in 2.6 we do not try to make devices persistent across disconnections by
> > recognizing some type of unique ID.
>
> The patch was against 2.6.7. Why do you think this is for 2.4?

So it is. I was mistaken because it was late at night, and I read this
entry in the to-do list:

+ * -- use serial numbers to hook onto same hosts (same minor) after
disconnect

That feature is present in usb-storage for 2.4 but not 2.6, so I assumed
there would be no reason to add it to ub for 2.6 either.

Alan Stern

2004-06-27 15:23:30

by Andries Brouwer

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, Jun 27, 2004 at 04:24:18PM +0200, Oliver Neukum wrote:

> > > > The above writes clearly and simply what one wants.
> > > > I expect that you propose writing
> > > >
> > > > ????????*((u32 *)(cmd->cdb + 2)) = cpu_to_be32(block);
> > > >
> > > > or some similar unspeakable ugliness.
> > > > If you had something else in mind, please reveal what.
> > >
> > > That "ugliness" has the unspeakable advantage of producing sane code
> > > on big endian architectures.
> >
> > I am not so sure. It tells the compiler to do a 4-byte access
> > on an address that possibly is not 4-byte aligned.
>
> We also have the unaligned family of macro. Probably the cleanest
> solution would be a union to do away with the ugly casts that would
> be needed.

You see what is happening. Pete writes simple straightforward correct
code. You want to replace it by ugly code that is perhaps not 100%
correct. Then the ugliness spreads - not only a local cast, but
globally the data structures must be adapted. And would it help? What
padding do you get in that union? Do the details depend on the gcc
version? On the compilation flags?

Always write simple direct obvious code. Avoid all casts.
Uglifying code burdens maintenance. It endangers correctness.
It is reasonable only when efficiency is really important,
when every nanosecond counts (and the ugly code is really faster).
That is not the case here.


Andries

2004-06-27 15:28:44

by Alan Stern

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sat, 26 Jun 2004, Matthew Dharm wrote:

> > I agree that the debug logging in usb-storage is not good. A worthwhile
> > improvement would be to log only commands that fail or get an error, with
> > the logging selected by the normal USB debugging (not usb-storage verbose
> > debugging) configuration option. Matt, what do you think?
>
> This is an acknowledged weak point of usb-storage.
>
> A 'log only on error' system might be good... but it's going to be a bit
> tricky for two reasons:
>
> 1) We have to keep data around longer than currently, so we can log it if
> something goes wrong later (so we see how we got to this point).

Yep. I just wonder if logging only the current command will be enough.
Sometimes it's important to see what other, successful commands preceded
the one that didn't work. Well, we can always fall back on the old
verbose debugging.

> 2) What counts as an error? We probably only want this to kick in when the
> SCSI error-recovery does. Normal 'failed commands' probably shouldn't
> count.

An error is any time the transport routine returns TRANSPORT_ERROR or the
command is aborted by the SCSI layer. We should also log SCSI-initiated
resets, but not the auto-resets for the Bulk-only protocol.

Alan Stern

2004-06-27 15:45:22

by Andries Brouwer

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, Jun 27, 2004 at 11:19:38AM -0400, Alan Stern wrote:

> My favorite approach has always been:
>
> put_be32(cmd->cdb + 2, block);

Yes, not bad.

I still prefer writing explicitly what happens.
But this is a fairly clean alternative.

Andries

2004-06-27 16:10:16

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Sonntag, 27. Juni 2004 17:23 schrieb Andries Brouwer:
> Always write simple direct obvious code. Avoid all casts.
> Uglifying code burdens maintenance. It endangers correctness.
> It is reasonable only when efficiency is really important,
> when every nanosecond counts (and the ugly code is really faster).
> That is not the case here.

No, writing the same code thousands of times makes code
unmaintainable. Finding correct and nice helper macros makes
the code good. Doing things "by foot" is the stone age way.

Regards
Oliver

2004-06-27 20:29:55

by Pete Zaitcev

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004 11:23:10 -0400 (EDT)
Alan Stern <[email protected]> wrote:

> + * -- use serial numbers to hook onto same hosts (same minor) after
> disconnect

It was a poor wording by me. It refers to the drift of naming due to
increments in usb_host_id. After a disconnect and reconnect, /dev/uba1
refers to the device, but /proc/partitions says "ubb".

To correct this, I have to set gendisk->fist_minor before calling
add_disk(), but in order to do that, a driver has to track devices
somehow. A serial number looks like an obvious candidate for a key.

-- Pete

2004-06-27 21:03:38

by Matthew Dharm

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, Jun 27, 2004 at 01:29:45PM -0700, Pete Zaitcev wrote:
> On Sun, 27 Jun 2004 11:23:10 -0400 (EDT)
> Alan Stern <[email protected]> wrote:
>
> > + * -- use serial numbers to hook onto same hosts (same minor) after
> > disconnect
>
> It was a poor wording by me. It refers to the drift of naming due to
> increments in usb_host_id. After a disconnect and reconnect, /dev/uba1
> refers to the device, but /proc/partitions says "ubb".
>
> To correct this, I have to set gendisk->fist_minor before calling
> add_disk(), but in order to do that, a driver has to track devices
> somehow. A serial number looks like an obvious candidate for a key.

Serial numbers are unreliable for this. We've had a long history with this
issue. Many devices do not provide a serial number. Many devices provide
a serial number, but it is not a constant. Many devices provide invalid
serial numbers.

Matt

--
Matthew Dharm Home: [email protected]
Maintainer, Linux USB Mass Storage Driver

IT KEEPS ASKING ME WHERE I WANT TO GO TODAY! I DONT WANT TO GO ANYWHERE!
-- Greg
User Friendly, 11/28/97


Attachments:
(No filename) (1.12 kB)
(No filename) (189.00 B)
Download all attachments

2004-06-27 21:27:43

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004 12:42:21 +0200
Oliver Neukum <[email protected]> wrote:

> OK, then it shouldn't be used in this case. However, shouldn't we have
> an attribute like __nopadding__ which does exactly that?

It would have the same effect. CPU structure layout rules don't pack
(or using other words, add padding) exactly in cases where it is
needed to obtain the necessary alignment.

2004-06-27 22:57:59

by David Brownell

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Pete Zaitcev wrote:

> This posting is largely a "release early release often" excercise, as
> Papa ESR taught us. But you can see the design outline clearly now,
> I hope, and I'm interested in feedback on that.

What hardware have you tested it with? It'd be good to know that it
works on Linux hardware running Alan's "File Storage Gadget" driver.
Which means most everything, given dummy_hcd ... :)


> +config BLK_DEV_UB
> + tristate "Low Performance USB Block driver"

Hmm, I'd have thought "low overhead" ... isn't that one
of the goals of omitting the SCSI layer?


> + /*
> + * We do not support transfers from highmem pages
> + * because the underlying USB framework does not.
> + *
> + * XXX Actually, there is a (very fresh and buggy) support
> + * for transfers under control of struct scatterlist, usb_map_sg()
> + * in 2.6.6, but it seems to have issues with highmem.
> + */

I could easily believe highmem issues, but there's nothing
preventing USB from handing highmem pages. The HCDs just
use the DMA addreses given to them, and don't care if that's
a highmem page, an ioummu mapped address, or a bounce buffer
allocated by dma mapping or other code.

That code isn't "very fresh and buggy", having been in use
with all USB-Storage devices for over a year and a half.
The bugs I've heard about have been either "device doesn't
work correctly at its peak transfer rate" (sigh) or, without
many recent reports, "wedged in cleanup after error".


> + /*
> + * This is a serious infraction, caused by a deficiency in the
> + * USB sg interface (usb_sg_wait()). We plan to remove this once
> + * we get mileage on the driver and can justify a change to USB API.
> + * See blk_queue_bounce_limit() to understand this part.
> + */
> + q->bounce_pfn = blk_max_low_pfn;
> + q->bounce_gfp = GFP_NOIO;

Well, out with it then -- what deficiency would that be? :)

It's true that EHCI doesn't do 64bit DMA on those Intel boxes,
but that's hardly a "serious" problem.

- Dave



2004-06-27 23:43:41

by Pete Zaitcev

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004 15:56:39 -0700
David Brownell <[email protected]> wrote:

> > +config BLK_DEV_UB
> > + tristate "Low Performance USB Block driver"
>
> Hmm, I'd have thought "low overhead" ... isn't that one
> of the goals of omitting the SCSI layer?

I do not care for the runtime overhead with ub. It might as well be
written in Java, copy every byte with get_user, and take a context
switch on every byte as long as it's bug free. Throwing SCSI away
throws away SCSI bugs (which are few, thanks jejb!), but also throws
away problems with interfacing to SCSI and sd.

Also, private SCSI stack allows to mimic Windows as closely as we can.
Want 36 byte inquiry? No problem! Want to ban START STOP UNIT? Be my
guest! With any luck, we'll be able to get by without anything like
unusual_devs.h (yes, I know it won't happen, but it's the ideal).

Custom error processing is a help as well. Currenly usb-storage
attempts to make SCSI eh to do the right thing by pulling very
thin threads. It is an excessively roundabout way to do things.

Performance is not a goal here. In some cases, ub might end marginally
faster than usb-storage, if only because it uses no worker thread, but
it's purely by accident.

> > + /*
> > + * This is a serious infraction, caused by a deficiency in the
> > + * USB sg interface (usb_sg_wait()). We plan to remove this once
> > + * we get mileage on the driver and can justify a change to USB API.
> > + * See blk_queue_bounce_limit() to understand this part.
> > + */
> > + q->bounce_pfn = blk_max_low_pfn;
> > + q->bounce_gfp = GFP_NOIO;
>
> Well, out with it then -- what deficiency would that be? :)

There is no way to submit a URB and give page, offset, length as arguments.
Three ways to accomplish a similar result exist currently:
0. Use bounce buffers and submit with kernel virtual address as argument.
1. Map everything yourself with "generic" DMA, then use URB_NO_TRANSFER_DMA_MAP.
This includes reading the DMA mask from the controller device, and falling
back if it is zero.
2. usb_sg_wait, which takes sg list but does not allow to submit anything
and must be called from a process.

Regardin #2 you say that ``that code isn't "very fresh and buggy", having
been in use with all USB-Storage devices for over a year and a half'' and
yet I observe that fairly serious fixes were applied just this week. What
passes muster with usb-storage is nowhere near the standard which Havoc
gave me mandate to reach and where ub shoots.

The hacking required to create usb_sg_submit() and have it sharing the
backend with usb_sg_wait is conceptually trivial. But it must be a
separate project. If it were started a year ago then I'd be happy to use
that API now. As it is, no way.

The #1 is a possibility, but it requires a little extra coding.

-- Pete

2004-06-27 23:57:21

by Pete Zaitcev

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004 00:46:53 +0200
Oliver Neukum <[email protected]> wrote:

> > +???????/* XXX Can be done at init */
> > +???????scmd->cdb[0] = REQUEST_SENSE;
> > +???????scmd->cdb_len = 6;
> > +???????scmd->dir = UB_DIR_READ;
> > +???????scmd->state = UB_CMDST_INIT;
> > +???????scmd->data = sc->top_sense;
>
> You must allocate a separate buffer to the sense data.
> We had similar code in hid which leads to data corruption
> on some architectures. It's an issue of DMA coherency.

I agree. This is also an issue for the work_bcs. I postponed doing it
right because that area waits for changes regarding queueing and
error processing in such case.

-- Pete

2004-06-28 00:10:53

by Pete Zaitcev

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004 11:19:38 -0400 (EDT)
Alan Stern <[email protected]> wrote:

> My favorite approach has always been:
>
> put_be32(cmd->cdb + 2, block);
>
> Unfortunately there is no such function or macro! It's easy to define an
> inline function that would carry out the series of four single-byte
> assignments that originally started this discussion. A more sophisticated
> implementation would expand to Andries' unspeakably ugly code on
> big-endian platforms that don't impose a large penalty for non-aligned
> 4-byte accesses. I leave it up to others to decide which is best on
> little-endian platforms that can do unaligned accesses.
>
> I think it would be great if some such utility routine were added to a
> standard header in the kernel, together with its siblings put_le32(),
> put_be16(), put_le16(), and the corresponding get_xxxx() functions.

This is very nice and would be a great help for Infiniband developers.
However, parts of SCSI commands are not defined in terms of C structures
or even 32 bit words with an endianness. They are streams of bytes, at
least historically. Please kindly refer to the WRITE(6) command for
the evidence. You'd need put_be20() to form that block address. :-)

I write these byte marshalling sequences since 1985 and I'm a little
used to them. I do not recall thinking twice about writing that element
of ub. It probably doesn't deserve all the tempest Oliver raised over it.

-- Pete

2004-06-28 14:15:51

by Scott Wood

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, Jun 27, 2004 at 02:26:28PM -0700, David S. Miller wrote:
> On Sun, 27 Jun 2004 12:42:21 +0200
> Oliver Neukum <[email protected]> wrote:
>
> > OK, then it shouldn't be used in this case. However, shouldn't we have
> > an attribute like __nopadding__ which does exactly that?
>
> It would have the same effect. CPU structure layout rules don't pack
> (or using other words, add padding) exactly in cases where it is
> needed to obtain the necessary alignment.

No, it wouldn't, as you could drop the assumption that the base of
the struct can be misaligned. Thus, the compiler only needs to
generate unaligned loads and stores for fields which are unaligned
within the struct, which in this case would be none of them.

While it's rather unlikely that a struct like this one would ever
need packing, it would help those structs that do need it by reducing
the number of fields subjected to unaligned loads and stores.

-Scott

2004-06-28 15:05:47

by David Brownell

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Pete Zaitcev wrote:

>>>+ * This is a serious infraction, caused by a deficiency in the
>>>+ * USB sg interface (usb_sg_wait()). ...
>>
>>Well, out with it then -- what deficiency would that be? :)
>
>
> There is no way to submit a URB and give page, offset, length as arguments.

Or a kitchen sink, either. Just map(page,offset) --> dma_addr
and pass that address, with the length ... no point in changing
the URB calls just for one driver. Especially when that driver
has so many other options already available, including your three:


> 0. Use bounce buffers and submit with kernel virtual address as argument.
> 1. Map everything yourself with "generic" DMA, then use URB_NO_TRANSFER_DMA_MAP.
> This includes reading the DMA mask from the controller device, and falling
> back if it is zero.

By "generic" presumably you really mean usb_buffer_map_sg()?

Remember, the "generic" DMA calls don't take arbitrary devices,
unless arch/platform code first gets modified to understand
that type of device. Only certain kinds of devices ... and
"usb devices" for some reason aren't on those lists.

That call is what's used inside the usb_sg_init() code. It solves
several nice-to-have-solved problems. You might even be able to
use sg_init just to allocate and init the URBs you'll submit, if
you're keen on minimizing your re-use of existing code.

If you only submit one URB at a time, and an IOMMU doesn't turn
your sglist into one dma buffer, you'd surely achieve your goal
of low performance. I've usually seen at most 10 MByte/sec with
that approach, vs more than 30 MByte/sec if the queue only empties
when there's no more data to transfer.


> 2. usb_sg_wait, which takes sg list but does not allow to submit anything
> and must be called from a process.

That doesn't make sense. But what you said later starts to: you want
to have "submit" separate from "wait" (or more likely, intercept a
final completion callback instead of having to wait_for_completion).
Pretty much like that comment in usb_sg_wait() describes as an
alternate implementation of the same notion ... :)


> Regardin #2 you say that ``that code isn't "very fresh and buggy", having
> been in use with all USB-Storage devices for over a year and a half'' and
> yet I observe that fairly serious fixes were applied just this week.

I have a hard time calling any bug "serious" when only one person even
manages to report the problem in that amount of time. "Very...buggy"
is arrant nonsense, if one hard-to-trigger bug is your entire proof.


> The hacking required to create usb_sg_submit() and have it sharing the
> backend with usb_sg_wait is conceptually trivial. But it must be a
> separate project. If it were started a year ago then I'd be happy to use
> that API now. As it is, no way.

So, the basic "deficiency" is that it's not a separate project?
Still doesn't make sense to me.

- Dave




2004-06-28 15:41:17

by Alan Stern

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004, Pete Zaitcev wrote:

> On Sun, 27 Jun 2004 11:23:10 -0400 (EDT)
> Alan Stern <[email protected]> wrote:
>
> > + * -- use serial numbers to hook onto same hosts (same minor) after
> > disconnect
>
> It was a poor wording by me. It refers to the drift of naming due to
> increments in usb_host_id. After a disconnect and reconnect, /dev/uba1
> refers to the device, but /proc/partitions says "ubb".
>
> To correct this, I have to set gendisk->fist_minor before calling
> add_disk(), but in order to do that, a driver has to track devices
> somehow. A serial number looks like an obvious candidate for a key.

I don't fully understand the nature of this problem. Is it that ub always
assigns the _first_ available minor number whereas add_disk() tries to
assign the _next_ available? If so, how does tracking devices help?
Shouldn't you just always want add_disk() to use the same minor number as
ub?

Or maybe I've misunderstood completely, not just partially. In any case,
are you sure you will want to do this? The directive for not tracking
serial numbers or trying in some other way to make devices appear to be
persistent across reconnects came directly from Linus.

Alan Stern

2004-06-28 15:57:01

by Alan Stern

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004, Pete Zaitcev wrote:

> Regardin #2 you say that ``that code isn't "very fresh and buggy", having
> been in use with all USB-Storage devices for over a year and a half'' and
> yet I observe that fairly serious fixes were applied just this week.

I have to object to the reasoning here. That same sort of argument could
be applied to almost any part of the Linux kernel.

Alan Stern

2004-06-28 16:01:50

by Alan Stern

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, 27 Jun 2004, Pete Zaitcev wrote:

> This is very nice and would be a great help for Infiniband developers.
> However, parts of SCSI commands are not defined in terms of C structures
> or even 32 bit words with an endianness. They are streams of bytes, at
> least historically. Please kindly refer to the WRITE(6) command for
> the evidence. You'd need put_be20() to form that block address. :-)

About 13 years ago I wrote a data analysis program for an NMR spectrometer
that stored its output as 24-bit integers! Yes, one does encounter weird
things from time to time. Still, having the functions I mentioned would
be very handy in the majority of cases. They remove several
possibilities for error (getting the bit shifts wrong, getting the array
indexes wrong, associating the right bit shift with the wrong index).

> I write these byte marshalling sequences since 1985 and I'm a little
> used to them. I do not recall thinking twice about writing that element
> of ub. It probably doesn't deserve all the tempest Oliver raised over it.

Andries too.

Alan Stern

2004-06-28 16:24:08

by David Brownell

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Alan Stern wrote:
> On Sun, 27 Jun 2004, Pete Zaitcev wrote:
>
>
>>Regardin #2 you say that ``that code isn't "very fresh and buggy", having
>>been in use with all USB-Storage devices for over a year and a half'' and
>>yet I observe that fairly serious fixes were applied just this week.
>
>
> I have to object to the reasoning here. That same sort of argument could
> be applied to almost any part of the Linux kernel.

I was also tempted to point out that the tone was objectionable.
The implication that only Pete and Havoc have useful standards
when it comes to code quality was ... offensive. Looks like I
just gave into that temptation, eh? :)



2004-06-28 16:42:23

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Montag, 28. Juni 2004 17:40 schrieb Alan Stern:
> Or maybe I've misunderstood completely, not just partially. ?In any case,
> are you sure you will want to do this? ?The directive for not tracking
> serial numbers or trying in some other way to make devices appear to be
> persistent across reconnects came directly from Linus.

IIRC he banned reconnecting device nodes in use.
Reusing the number is legal. In fact in a finite number space there's
always a chance that the number will have to be reused.

Regards
Oliver

2004-06-28 16:46:57

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Montag, 28. Juni 2004 18:23 schrieb David Brownell:
> Alan Stern wrote:
> > On Sun, 27 Jun 2004, Pete Zaitcev wrote:
> >
> >
> >>Regardin #2 you say that ``that code isn't "very fresh and buggy", having
> >>been in use with all USB-Storage devices for over a year and a half'' and
> >>yet I observe that fairly serious fixes were applied just this week.
> >
> >
> > I have to object to the reasoning here. That same sort of argument could
> > be applied to almost any part of the Linux kernel.
>
> I was also tempted to point out that the tone was objectionable.
> The implication that only Pete and Havoc have useful standards
> when it comes to code quality was ... offensive. Looks like I
> just gave into that temptation, eh? :)

Deep down in the blackest parts of your soul do you really think
differently about your own standards of quality ;-) ?

Regards
Oliver

2004-06-28 17:13:47

by David Brownell

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Oliver Neukum wrote:

>>>I have to object to the reasoning here. That same sort of argument could
>>>be applied to almost any part of the Linux kernel.
>>
>>I was also tempted to point out that the tone was objectionable.
>>The implication that only Pete and Havoc have useful standards
>>when it comes to code quality was ... offensive. Looks like I
>>just gave into that temptation, eh? :)
>
>
> Deep down in the blackest parts of your soul do you really think
> differently about your own standards of quality ;-) ?

Certainly I do.

- Dave


2004-06-28 19:50:42

by Alan Stern

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, 28 Jun 2004, Oliver Neukum wrote:

> Am Montag, 28. Juni 2004 17:40 schrieb Alan Stern:
> > Or maybe I've misunderstood completely, not just partially. ?In any case,
> > are you sure you will want to do this? ?The directive for not tracking
> > serial numbers or trying in some other way to make devices appear to be
> > persistent across reconnects came directly from Linus.
>
> IIRC he banned reconnecting device nodes in use.
> Reusing the number is legal. In fact in a finite number space there's
> always a chance that the number will have to be reused.

Sure. But then why go to the trouble of tracking serial numbers to
identify particular physical devices with particular minor numbers?

Alan Stern

2004-06-28 20:27:50

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, 28 Jun 2004 10:15:17 -0400
Scott Wood <[email protected]> wrote:

> On Sun, Jun 27, 2004 at 02:26:28PM -0700, David S. Miller wrote:
> > On Sun, 27 Jun 2004 12:42:21 +0200
> > Oliver Neukum <[email protected]> wrote:
> >
> > > OK, then it shouldn't be used in this case. However, shouldn't we have
> > > an attribute like __nopadding__ which does exactly that?
> >
> > It would have the same effect. CPU structure layout rules don't pack
> > (or using other words, add padding) exactly in cases where it is
> > needed to obtain the necessary alignment.
>
> No, it wouldn't, as you could drop the assumption that the base of
> the struct can be misaligned. Thus, the compiler only needs to
> generate unaligned loads and stores for fields which are unaligned
> within the struct, which in this case would be none of them.
>
> While it's rather unlikely that a struct like this one would ever
> need packing, it would help those structs that do need it by reducing
> the number of fields subjected to unaligned loads and stores.

That's true. But if one were to propose such a feature to the gcc
guys, I know the first question they would ask. "If no padding of
the structure is needed, why are you specifying this new
__nopadding__ attribute?"

I think it's bad to just "smack this attribute onto any structure that
_MIGHT_ need it on some platform" I never do that in my drivers,
and they work on all platforms. For example, if you have a simple
DMA descriptor structure such as:

struct txd {
u32 dma_addr;
u32 length;
};

It is just total and utter madness to put a packed or the proposed
__nopadding__ attribute on that structure. Yet this seems to be
what was suggested now and at the beginning of this thread.

2004-06-28 20:49:17

by Scott Wood

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, Jun 28, 2004 at 01:25:31PM -0700, David S. Miller wrote:
> That's true. But if one were to propose such a feature to the gcc
> guys, I know the first question they would ask. "If no padding of
> the structure is needed, why are you specifying this new
> __nopadding__ attribute?"

It would benefit other structures that *do* need it, but only for a
few fields.

> I think it's bad to just "smack this attribute onto any structure that
> _MIGHT_ need it on some platform" I never do that in my drivers,
> and they work on all platforms. For example, if you have a simple
> DMA descriptor structure such as:
>
> struct txd {
> u32 dma_addr;
> u32 length;
> };
>
> It is just total and utter madness to put a packed or the proposed
> __nopadding__ attribute on that structure. Yet this seems to be
> what was suggested now and at the beginning of this thread.

As long as GCC generates code as it does, sure, it's madness.

However, what if it were to be run on a machine that can't address
smaller quantities than 64-bit? Such a machine sounds silly, but it
could happen (just as early Alphas couldn't directly load or store
smaller than 32-bit quantities), and thus the compiler might want to
pad them so that they don't share a word. If a way exists to express
to the compiler that the format of the struct is intended to be
exactly as specified, without causing any detrimental effect, why not
make use of it?

As an aside, what would *really* be nice is if GCC had an attribute
that lets one specify the endianness of the field, so that one
doesn't have to mess around with conversion functions/macros
uglifying the code. It'd probably be faster, too, as the optimizer
could deal with the loads and stores like any other.

-Scott

2004-06-28 20:51:26

by Matthew Dharm

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, Jun 28, 2004 at 01:25:31PM -0700, David S. Miller wrote:
> I think it's bad to just "smack this attribute onto any structure that
> _MIGHT_ need it on some platform" I never do that in my drivers,
> and they work on all platforms. For example, if you have a simple
> DMA descriptor structure such as:
>
> struct txd {
> u32 dma_addr;
> u32 length;
> };
>
> It is just total and utter madness to put a packed or the proposed
> __nopadding__ attribute on that structure. Yet this seems to be
> what was suggested now and at the beginning of this thread.

I guess, in the end, what this comes down to is the fact that we're all
going to get bitten on the ass when we finally get to a platform where the
default alignment is 64-bits, which would then (by default) add padding to
the above structure.

How long until that time comes? Likely within my lifetime, and I'd rather
not have to re-write working code into more working code because I couldn't
express to the compiler what I needed it to do.

Yes, __packed__ is overkill, because it specifies both a no-wasted-space
storage as well as the possibility of a completely unaligned pointer.
__nopadding__ would, as proposed, represent what we mean more closely.

Personally, I think it would be nice to see a way to mark all structures
that are passed "over the wire" (regardless of if that wire is USB, PCI, or
whatever), so that when we move to the FooMatic4000 arch, it will
JustWork(tm) instead of being a major PITA.

Matt

--
Matthew Dharm Home: [email protected]
Maintainer, Linux USB Mass Storage Driver

You are needink to look more evil. You likink very strong coffee?
-- Pitr to Dust Puppy
User Friendly, 10/16/1998


Attachments:
(No filename) (1.70 kB)
(No filename) (189.00 B)
Download all attachments

2004-06-28 20:57:07

by Oliver Neukum

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Am Montag, 28. Juni 2004 22:25 schrieb David S. Miller:
> That's true. ?But if one were to propose such a feature to the gcc
> guys, I know the first question they would ask. ?"If no padding of
> the structure is needed, why are you specifying this new
> __nopadding__ attribute?"

It would replace some uses of __packed__, where the first element
is aligned.

> I think it's bad to just "smack this attribute onto any structure that
> _MIGHT_ need it on some platform" ?I never do that in my drivers,
> and they work on all platforms. ?For example, if you have a simple
> DMA descriptor structure such as:
>
> ????????struct txd {
> ????????????????u32 dma_addr;
> ????????????????u32 length;
> ????????};
>
> It is just total and utter madness to put a packed or the proposed
> __nopadding__ attribute on that structure. ?Yet this seems to be
> what was suggested now and at the beginning of this thread.

I did suggest that. I still think it has some advantages, but I am not
sure whether they are sufficient
1) Not every structure is that simple
2) It is an architecture specific requirement
3) padding rules might change
4) It simplifies driver writing

Regards
Oliver

2004-06-28 21:00:04

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, 28 Jun 2004 16:48:57 -0400
Scott Wood <[email protected]> wrote:

> However, what if it were to be run on a machine that can't address
> smaller quantities than 64-bit? Such a machine sounds silly, but it
> could happen (just as early Alphas couldn't directly load or store
> smaller than 32-bit quantities),

You are still hitting right at the heart of why I think all of
this talk is madness and silly, you're staying in the realm of
"what ifs".

Cross that bridge when we get there and no sooner, ok? :-)

As a side note, even though early Alpha's could not address smaller
than word quantities directly with loads and stores, the structure
layout defined by the Alpha ABIs did not pad such elements inside
of structures. It simply emitted word sized loads, then extracted
the byte or half-word using shifts and masks.

So even if such a maniac machine as you described were created, it
would likely shift+mask out from 64-bit loads the elements it needed
instead of padding structures uselessly. Structure padding eats memory
which is why ABI designers avoid it like the plague.

2004-06-28 21:02:20

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, 28 Jun 2004 13:50:58 -0700
Matthew Dharm <[email protected]> wrote:

> I guess, in the end, what this comes down to is the fact that we're all
> going to get bitten on the ass when we finally get to a platform where the
> default alignment is 64-bits, which would then (by default) add padding to
> the above structure.

No it would not, see the other email I just sent out. Even when Alpha's
could not address smaller-than-word quantities with load/store instructions,
it _DID NOT_ pad such elements in it's ABI defined structure layout rules.
It did word loads, and shift+mask'd out the elements it wanted as needed.

2004-06-28 21:04:36

by Pete Zaitcev

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, 28 Jun 2004 13:50:58 -0700
Matthew Dharm <[email protected]> wrote:

> > struct txd {
> > u32 dma_addr;
> > u32 length;
> > };
> >
> > It is just total and utter madness to put a packed or the proposed
> > __nopadding__ attribute on that structure. Yet this seems to be
> > what was suggested now and at the beginning of this thread.
>
> I guess, in the end, what this comes down to is the fact that we're all
> going to get bitten on the ass when we finally get to a platform where the
> default alignment is 64-bits, which would then (by default) add padding to
> the above structure.
>
> How long until that time comes? Likely within my lifetime, and I'd rather
> not have to re-write working code into more working code because I couldn't
> express to the compiler what I needed it to do.

I, for one, am not engaging into such flights of fancy as a platform
with larger than natural alignment requirements. Would you even read
what you're writing? The whole freaking world abandons silly platforms
and moves to x86 extensions and you're fantasizing about a return
to Cray-1. It just ain't happening!

My ass is completely at ease about being bitten by your imaginary monsters.

-- Pete

2004-06-28 21:05:50

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, 28 Jun 2004 22:57:11 +0200
Oliver Neukum <[email protected]> wrote:

> Am Montag, 28. Juni 2004 22:25 schrieb David S. Miller:
> > That's true. ?But if one were to propose such a feature to the gcc
> > guys, I know the first question they would ask. ?"If no padding of
> > the structure is needed, why are you specifying this new
> > __nopadding__ attribute?"
>
> It would replace some uses of __packed__, where the first element
> is aligned.

You have not considered what is supposed to happen when this
structure is embedded within another one. What kind of alignment
rules apply in that case? For example:

struct foo {
u32 x;
u8 y;
u16 z;
} __attribute__((__packed__));

struct bar {
u8 a;
struct foo b;
};

That is why __packed__ can't assume the alignment of any structure
instance whatsoever. Your __nopadding__ attribute proposal would
lay out struct bar differently in order to meet the alignment guarentees
you say it will be able to meet.

2004-06-28 21:21:33

by Scott Wood

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, Jun 28, 2004 at 02:03:43PM -0700, David S. Miller wrote:
> You have not considered what is supposed to happen when this
> structure is embedded within another one. What kind of alignment
> rules apply in that case? For example:
>
> struct foo {
> u32 x;
> u8 y;
> u16 z;
> } __attribute__((__packed__));
>
> struct bar {
> u8 a;
> struct foo b;
> };

As long as bar is not packed, why shouldn't the beginning of bar.b be
aligned?

If you did it the other way around, and had bar packed but foo not
(or if both had nopadding specified), that would be a problem, and
should probably at least generate a warning (if not an error) if you
take the address of the inner struct.

However, doing something like that is already broken; if you use a
pointer to the inner struct rather than going through the base, GCC
will use normal loads and stores to unaligned data, without even a
warning. Fixing that, other than by disallowing it, would likely
require making unalignedness a pointer qualifier.

-Scott

2004-06-28 22:23:46

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, 28 Jun 2004 17:18:57 -0400
Scott Wood <[email protected]> wrote:

> On Mon, Jun 28, 2004 at 02:03:43PM -0700, David S. Miller wrote:
> > You have not considered what is supposed to happen when this
> > structure is embedded within another one. What kind of alignment
> > rules apply in that case? For example:
> >
> > struct foo {
> > u32 x;
> > u8 y;
> > u16 z;
> > } __attribute__((__packed__));
> >
> > struct bar {
> > u8 a;
> > struct foo b;
> > };
>
> As long as bar is not packed, why shouldn't the beginning of bar.b be
> aligned?

No! bar.b starts at offset 1 byte. That's how this stuff works.

This is exactly why you cannot assume the alignment of any structure
which is given attribute __packed__. The example above shows that
quite clearly.

Try it out if you don't believe someone who has maintained the
Linux networking for 7 years and sits on the GCC Steering Committee.
:-)

2004-06-28 22:31:56

by Scott Wood

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, Jun 28, 2004 at 03:22:08PM -0700, David S. Miller wrote:
> On Mon, 28 Jun 2004 17:18:57 -0400
> Scott Wood <[email protected]> wrote:
> > As long as bar is not packed, why shouldn't the beginning of bar.b be
> > aligned?
>
> No! bar.b starts at offset 1 byte. That's how this stuff works.

That may be how it does work, but why is that how it *should* work?
If I want bar packed, I'll specify it as packed. There's no reason
to keep this behavior with a nopadding attribute, as it would be a
new attribute with no existing code to break.

The important thing is that the offsets within the packed/nopadding
struct are exactly as specified, and padding the first element
doesn't change that.

-Scott

2004-06-28 22:40:51

by Roland Dreier

[permalink] [raw]
Subject: Re: drivers/block/ub.c

David> Try it out if you don't believe someone who has maintained
David> the Linux networking for 7 years and sits on the GCC
David> Steering Committee. :-)

Just so I'm sure I'm not misunderstanding, are you promising me that I
will never get bitten on the ass on any architecture with kernel code
that assumes the compiler will never add padding as long as struct
fields are naturally aligned to their size? :-)

For example, you're saying I'm definitely safe declaring a structure
(used as a HW descriptor) like say

struct mthca_qp_path {
u32 port_pkey;
u8 rnr_retry;
u8 g_mylmc;
u16 rlid;
u8 ackto;
u8 mgid_index;
u8 static_rate;
u8 hop_limit;
u32 sl_tclass_flowlabel;
u8 rgid[16];
};

without __attribute__((packed))? (Of course I'll only use this struct
aligned to its size)

Thanks,
Roland

2004-06-28 23:52:53

by Matthew Dharm

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, Jun 28, 2004 at 02:01:57PM -0700, Pete Zaitcev wrote:
> On Mon, 28 Jun 2004 13:50:58 -0700
> Matthew Dharm <[email protected]> wrote:
>
> > > struct txd {
> > > u32 dma_addr;
> > > u32 length;
> > > };
> > >
> > > It is just total and utter madness to put a packed or the proposed
> > > __nopadding__ attribute on that structure. Yet this seems to be
> > > what was suggested now and at the beginning of this thread.
> >
> > I guess, in the end, what this comes down to is the fact that we're all
> > going to get bitten on the ass when we finally get to a platform where the
> > default alignment is 64-bits, which would then (by default) add padding to
> > the above structure.
> >
> > How long until that time comes? Likely within my lifetime, and I'd rather
> > not have to re-write working code into more working code because I couldn't
> > express to the compiler what I needed it to do.
>
> I, for one, am not engaging into such flights of fancy as a platform
> with larger than natural alignment requirements. Would you even read
> what you're writing? The whole freaking world abandons silly platforms
> and moves to x86 extensions and you're fantasizing about a return
> to Cray-1. It just ain't happening!

Actually, I've been working with a controller for the last several months
(the "latest and greatest", state-of-the-art technology), which can only to
loads/stores in 64-byte units at a minimum.

Yes, I mean "byte". 128-bits wide * 4 DDR beats (2 dobule-edged clocks).
There are no DM pins on the memory interface, so any write less than that
size must be a read-modify-write.

And, I was just having a conversation with a compiler group which was
considering moving to 64-byte alignment as a speed optimization for this
platform, or at least 16-byte (64 bit), which also aligns well with the
native load/store of the CPU involved.

I only wish I was fantasizing about this.

Matt

--
Matthew Dharm Home: [email protected]
Maintainer, Linux USB Mass Storage Driver

P: Nine more messages in admin.policy.
M: I know, I'm typing as fast as I can!
-- Pitr and Mike
User Friendly, 11/27/97


Attachments:
(No filename) (2.14 kB)
(No filename) (189.00 B)
Download all attachments

2004-06-28 23:59:11

by Jeff Garzik

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sun, Jun 27, 2004 at 05:45:15PM +0200, Andries Brouwer wrote:
> I still prefer writing explicitly what happens.

This is my preference for the implementation of conversions to and from
what I call 'scsi endian'.

Personally, I am pondering creation and use of cpu_to_scsiXX() and
scsiXX_to_cpu() static inline helper functions in my code, because
I'm tired of hand-coding conversions for {read,write}{6,10,12,16}.

Further, the value I see in direct cdb-to-something conversions has
declined over time. I now prefer the more clear

* scsi-to-u32
* u32-to-something

operations as preferred to a single

* scsi-to-something

operation that's hand-coded and frequently gets corner cases wrong.

Jeff


2004-06-29 01:39:49

by Robert White

[permalink] [raw]
Subject: RE: drivers/block/ub.c



David S. Miller Wrote:

> Oliver Neukum <[email protected]> wrote:

> > OK, then it shouldn't be used in this case. However, shouldn't we have
> > an attribute like __nopadding__ which does exactly that?


> It would have the same effect. CPU structure layout rules don't pack
> (or using other words, add padding) exactly in cases where it is
> needed to obtain the necessary alignment.

It "seems" that having an attribute on a structure that says "pack my internals, but
my base will be normally aligned" would be reasonable, if not easy to manage.

In such a case, the compiler would be able to recognize the poorly aligned internal
members for which the ugly code needs to be generated, and still generate the optimal
instructions for the coincidentally well-aligned accesses.

struct whatever {
u8 thing;
u16 thing2;
u8 thing3[5];
u32 thing4[10];
} attribute ((__packed__));

Has no rationally evident reason to generate ugly code for thing4 access. Most
programmers will not expect thing4[x] to suffer any degradation whatsoever even as
they understand that thing2 is unhappiness personified. There is nothing to even
remotely suggest that an instance of whatever is going to be based on a poorly
aligned pointer (etc).

That is, it is easy to imagine a programmer being able to know that the guts are
packed, but the overall placement and access is normal. As a matter of fact, that is
the "normal" way packed structures are used in the first place, since such structures
are just about always created normally (allocated, automatic, etc) and only get
__packed__ when they have to meet some exterior qualification like writing to a
device.

My presumption (and Oliver's apparently), would be that __packed__ and __unaligned__
are completely different assertions with no predicate relationship between them.
Least astonishment kind-of then dictates that __packed__ isn't going to
__ruin_access__ (8-) structures who's innards happen to be the same on a platform
where the not-__packed__ representation has the same map.

In fact, "structures are packed, instances are aligned" approaches aphorism. One
could just as easily expect have an normal, unpacked structure mapped into/over
memory at an unaligned address to meet the requirement of some particular custom
hardware or for fast-encoding into/out-of a text buffer.

Rob.


2004-06-29 01:55:41

by Robert White

[permalink] [raw]
Subject: RE: drivers/block/ub.c

The below makes no sense to me... Nothing in the definition of struct bar{} (which
is not packed) infers (top me) in the slightest that foo should be unnaturally
aligned within it.

Just because foo is internally un-aligned, it doesn't become a god-like dirty finger
to with which to corrupt external entities. If you want bar.b packed up against
bar.a then bar should be __packed__ also.

Given the alignment rules for _empty_ structures as members of structures, why would
a non-empty structure that is packed be less stringently aligned than an empty one
that isn?t?

Perhaps I am na?ve.


-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of David S. Miller
Sent: Monday, June 28, 2004 2:04 PM
To: Oliver Neukum
Cc: [email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]
Subject: Re: drivers/block/ub.c

On Mon, 28 Jun 2004 22:57:11 +0200
Oliver Neukum <[email protected]> wrote:

> Am Montag, 28. Juni 2004 22:25 schrieb David S. Miller:
> > That's true. ?But if one were to propose such a feature to the gcc
> > guys, I know the first question they would ask. ?"If no padding of
> > the structure is needed, why are you specifying this new
> > __nopadding__ attribute?"
>
> It would replace some uses of __packed__, where the first element
> is aligned.

You have not considered what is supposed to happen when this
structure is embedded within another one. What kind of alignment
rules apply in that case? For example:

struct foo {
u32 x;
u8 y;
u16 z;
} __attribute__((__packed__));

struct bar {
u8 a;
struct foo b;
};

That is why __packed__ can't assume the alignment of any structure
instance whatsoever. Your __nopadding__ attribute proposal would
lay out struct bar differently in order to meet the alignment guarentees
you say it will be able to meet.

2004-06-29 02:16:00

by David Miller

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, 28 Jun 2004 18:54:46 -0700
"Robert White" <[email protected]> wrote:

> The below makes no sense to me... Nothing in the definition of struct bar{} (which
> is not packed) infers (top me) in the slightest that foo should be unnaturally
> aligned within it.

First of all, it is what the compiler does and has done since the
__packed__ attribute was added.

Second of all, you are asking it to "PACK" the structure, this includes
any place you place it within other data objects. It becomes an N-byte
blob that has no alignment constraints must be placed exactly where it
is declared.

I am growing very tired of this thread.

2004-06-29 02:50:26

by Robert White

[permalink] [raw]
Subject: RE: drivers/block/ub.c

Ok, that is what it does. That is not what *I* would have *expected* it to do. I
would have expected it to remain a struct when viewed from the outside rather than
become an "N-byte blob".


-----Original Message-----
From: David S. Miller [mailto:[email protected]]
Sent: Monday, June 28, 2004 7:16 PM
To: Robert White
Cc: [email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected];
[email protected]; [email protected];
[email protected]; [email protected]
Subject: Re: drivers/block/ub.c

On Mon, 28 Jun 2004 18:54:46 -0700
"Robert White" <[email protected]> wrote:

> The below makes no sense to me... Nothing in the definition of struct bar{} (which
> is not packed) infers (top me) in the slightest that foo should be unnaturally
> aligned within it.

First of all, it is what the compiler does and has done since the
__packed__ attribute was added.

Second of all, you are asking it to "PACK" the structure, this includes
any place you place it within other data objects. It becomes an N-byte
blob that has no alignment constraints must be placed exactly where it
is declared.

I am growing very tired of this thread.


2004-06-29 07:12:48

by Vojtech Pavlik

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Mon, Jun 28, 2004 at 01:25:31PM -0700, David S. Miller wrote:
> On Mon, 28 Jun 2004 10:15:17 -0400
> Scott Wood <[email protected]> wrote:
>
> > On Sun, Jun 27, 2004 at 02:26:28PM -0700, David S. Miller wrote:
> > > On Sun, 27 Jun 2004 12:42:21 +0200
> > > Oliver Neukum <[email protected]> wrote:
> > >
> > > > OK, then it shouldn't be used in this case. However, shouldn't we have
> > > > an attribute like __nopadding__ which does exactly that?
> > >
> > > It would have the same effect. CPU structure layout rules don't pack
> > > (or using other words, add padding) exactly in cases where it is
> > > needed to obtain the necessary alignment.
> >
> > No, it wouldn't, as you could drop the assumption that the base of
> > the struct can be misaligned. Thus, the compiler only needs to
> > generate unaligned loads and stores for fields which are unaligned
> > within the struct, which in this case would be none of them.
> >
> > While it's rather unlikely that a struct like this one would ever
> > need packing, it would help those structs that do need it by reducing
> > the number of fields subjected to unaligned loads and stores.
>
> That's true. But if one were to propose such a feature to the gcc
> guys, I know the first question they would ask. "If no padding of
> the structure is needed, why are you specifying this new
> __nopadding__ attribute?"

You may have a struct, which itself might 'need' padding somewhere
inside, however the structure start will always be aligned. Using
__nopadding__ you should be much better off than using __packed_ in this
case, because GCC can use the right aligned/nonaligned accesses for the
members of the structure, because it knows which will be aligned and
which not.

--
Vojtech Pavlik
SuSE Labs, SuSE CR

2004-06-29 11:06:00

by Jeff Garzik

[permalink] [raw]
Subject: Re: drivers/block/ub.c

On Sat, Jun 26, 2004 at 01:06:45PM -0700, Pete Zaitcev wrote:
> Hi, guys,
>
> I have drafted up an implementation of a USB storage driver as I wish
> it done (called "ub"). The main goal of the project is to produce a driver
> which never oopses, and above all, never locks up the machine. To this
> point I did all my debugging while running X11 and yapping on IRC. If this
> goal requires to sacrifice performance, so be it. This is how ub differs
> from the usb-storage.
>
> The current usb-storage works quite well on servers where netdump can
> be brought to bear, but on desktop its debuggability leaves some room
> for improvement. In all other respects, it is superior to ub. Since
> characteristics of usb-storage and ub are different I expect them to
> coexist potentially indefinitely (if ub finds any use at all).
>
> Please refer to the attached patch. It is quite raw, for instance the
> disconnect is not processed at all, although I do have a plan for it.
> This posting is largely a "release early release often" excercise, as
> Papa ESR taught us. But you can see the design outline clearly now,
> I hope, and I'm interested in feedback on that.


I can't comment on the USB portions, but the rest seems sane to me.

Jeff



2004-06-29 17:06:56

by Kurt Garloff

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Hi,

On Sun, Jun 27, 2004 at 02:26:28PM -0700, David S. Miller wrote:
> On Sun, 27 Jun 2004 12:42:21 +0200
> Oliver Neukum <[email protected]> wrote:
>
> > OK, then it shouldn't be used in this case. However, shouldn't we have
> > an attribute like __nopadding__ which does exactly that?
>
> It would have the same effect. CPU structure layout rules don't pack
> (or using other words, add padding) exactly in cases where it is
> needed to obtain the necessary alignment.

This is a compiler shortcoming then.
The compiler does know the requirements and should be able to determine
whether or not it would have addedd padding or not.

Regards,
--
Kurt Garloff <[email protected]> [Koeln, DE]
Physics:Plasma modeling <[email protected]> [TU Eindhoven, NL]
Linux: SUSE Labs (Head) <[email protected]> [SUSE Nuernberg, DE]


Attachments:
(No filename) (870.00 B)
(No filename) (189.00 B)
Download all attachments

2004-06-29 18:32:24

by Andy Isaacson

[permalink] [raw]
Subject: Re: drivers/block/ub.c

Dave, you seem to be arguing "This is how __packed__ works, therefore
this is how __packed__ works, therefore anything else is now how
__packed__ works". Oliver is trying to propose *new* semantics which
*differ* from __packed__ in a way that seems useful.

On Mon, Jun 28, 2004 at 02:03:43PM -0700, David S. Miller wrote:
> On Mon, 28 Jun 2004 22:57:11 +0200
> Oliver Neukum <[email protected]> wrote:
> > Am Montag, 28. Juni 2004 22:25 schrieb David S. Miller:
> > > That's true. ?But if one were to propose such a feature to the gcc
> > > guys, I know the first question they would ask. ?"If no padding of
> > > the structure is needed, why are you specifying this new
> > > __nopadding__ attribute?"
> >
> > It would replace some uses of __packed__, where the first element
> > is aligned.
>
> You have not considered what is supposed to happen when this
> structure is embedded within another one. What kind of alignment
> rules apply in that case? For example:
>
> struct foo { u32 x; u8 y; u16 z; } __attribute__((__packed__));
> struct bar { u8 a; struct foo b; };
>
> That is why __packed__ can't assume the alignment of any structure
> instance whatsoever. Your __nopadding__ attribute proposal would
> lay out struct bar differently in order to meet the alignment guarentees
> you say it will be able to meet.


Here's Oliver's suggestion, as I understand it:
- a __nopadding__ struct is naturally aligned for its first member.
- The compiler does not insert alignment into a __nopadding__ struct.
- From the outside, a __nopadding__ struct does not differ from a
normal struct (one lacking all attribute()s), except in its size. So
your "struct foo" above (with __nopadding__) would be 7 bytes with
4-byte alignment for the u32.

As proposed, __nopadding__ is better than __packed__ because leading
correctly-aligned elements can be accessed directly with aligned loads
rather than requiring byte-at-a-time loads on platforms such as SPARC.

To answer your question: a __nopadding__ struct embedded in another
struct will be naturally aligned just as a normal struct with the same
members would have been. (Possible variation: align it as necessary
for the first member, treat the rest as "bag 'o bits".)

It's unfortunate that GCC has conflated several not-necessarily-related
features into a single switch.

1. no padding between elements
2. no alignment internally
3. no alignment externally

This results in confusion, as Scott shows below. Worse, poorly-defined
semantics are a likely source of implementation bugs -- are you
confident that every aspect of __packed__ works the same in every
compiler that understands attribute((packed))? Including ICC and
gcc-2.6.0?

On Mon, Jun 28, 2004 at 03:22:08PM -0700, David S. Miller wrote:
> On Mon, 28 Jun 2004 17:18:57 -0400
> Scott Wood <[email protected]> wrote:
> > On Mon, Jun 28, 2004 at 02:03:43PM -0700, David S. Miller wrote:
> > > struct foo { u32 x; u8 y; u16 z; } __attribute__((__packed__));
> > > struct bar { u8 a; struct foo b; };
> >
> > As long as bar is not packed, why shouldn't the beginning of bar.b be
> > aligned?
>
> No! bar.b starts at offset 1 byte. That's how this stuff works.
>
> This is exactly why you cannot assume the alignment of any structure
> which is given attribute __packed__. The example above shows that
> quite clearly.

-andy