Here is a repost of yesterday's patches for firewire-net. I dropped the
error logging limitation.
firewire: net: count stats.tx_packets and stats.tx_bytes
firewire: net: fix memory leaks
firewire: net: replace lists by counters
firewire: net: throttle TX queue before running out of tlabels
drivers/firewire/net.c | 137 ++++++++++++++++++++++++-----------------
1 file changed, 83 insertions(+), 54 deletions(-)
--
Stefan Richter
-=====-==-=- =-== --===
http://arcgraph.de/sr/
Signed-off-by: Stefan Richter <[email protected]>
---
drivers/firewire/net.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
Index: b/drivers/firewire/net.c
===================================================================
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -907,6 +907,7 @@ static int fwnet_send_packet(struct fwne
static void fwnet_transmit_packet_done(struct fwnet_packet_task *ptask)
{
struct fwnet_device *dev = ptask->dev;
+ struct sk_buff *skb = ptask->skb;
unsigned long flags;
bool free;
@@ -917,8 +918,11 @@ static void fwnet_transmit_packet_done(s
/* Check whether we or the networking TX soft-IRQ is last user. */
free = (ptask->outstanding_pkts == 0 && !list_empty(&ptask->pt_link));
- if (ptask->outstanding_pkts == 0)
+ if (ptask->outstanding_pkts == 0) {
list_del(&ptask->pt_link);
+ dev->netdev->stats.tx_packets++;
+ dev->netdev->stats.tx_bytes += skb->len;
+ }
spin_unlock_irqrestore(&dev->lock, flags);
@@ -927,7 +931,6 @@ static void fwnet_transmit_packet_done(s
u16 fg_off;
u16 datagram_label;
u16 lf;
- struct sk_buff *skb;
/* Update the ptask to point to the next fragment and send it */
lf = fwnet_get_hdr_lf(&ptask->hdr);
@@ -954,7 +957,7 @@ static void fwnet_transmit_packet_done(s
datagram_label = fwnet_get_hdr_dgl(&ptask->hdr);
break;
}
- skb = ptask->skb;
+
skb_pull(skb, ptask->max_payload);
if (ptask->outstanding_pkts > 1) {
fwnet_make_sf_hdr(&ptask->hdr, RFC2374_HDR_INTFRAG,
--
Stefan Richter
-=====-==-=- =-== --===
http://arcgraph.de/sr/
a) fwnet_transmit_packet_done used to poison ptask->pt_link via list_del.
If fwnet_send_packet later checked whether it was responsible for the
cleanup (in the border case that the TX soft IRQ was outpaced by the
AT-req tasklet on another CPU), it missed this because ptask->pt_link no
longer appeared empty.
b) If fwnet_write_complete got an rcode other than RCODE_COMPLETE, we
failed to free the skb and the ptask entirely.
Also, count stats.tx_dropped and stats.tx_errors when rcode != 0.
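To illustrate the intended handshake: whichever of the two contexts (the
networking TX soft IRQ or the AT-req tasklet) finishes last has to free
the ptask. The following is only a simplified sketch of that pattern with
made-up names (struct task, task_put), not the driver code itself:

#include <linux/slab.h>
#include <linux/spinlock.h>

/* Illustrative only: both contexts call task_put() under the same lock;
 * exactly one of them sees both flags set and frees the object. */
struct task {
	spinlock_t *lock;	/* shared device lock, outlives the task */
	bool submitted;		/* submitting context is done */
	bool completed;		/* completion context has run */
};

static void task_put(struct task *t, bool is_submitter)
{
	unsigned long flags;
	bool free;

	spin_lock_irqsave(t->lock, flags);
	if (is_submitter)
		t->submitted = true;
	else
		t->completed = true;
	/* Only the second of the two callers sees both flags set. */
	free = t->submitted && t->completed;
	spin_unlock_irqrestore(t->lock, flags);

	if (free)
		kfree(t);
}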
Signed-off-by: Stefan Richter <[email protected]>
---
drivers/firewire/net.c | 35 +++++++++++++++++++++++++++++++----
1 file changed, 31 insertions(+), 4 deletions(-)
Index: b/drivers/firewire/net.c
===================================================================
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -917,9 +917,10 @@ static void fwnet_transmit_packet_done(s
/* Check whether we or the networking TX soft-IRQ is last user. */
free = (ptask->outstanding_pkts == 0 && !list_empty(&ptask->pt_link));
+ if (free)
+ list_del(&ptask->pt_link);
if (ptask->outstanding_pkts == 0) {
- list_del(&ptask->pt_link);
dev->netdev->stats.tx_packets++;
dev->netdev->stats.tx_bytes += skb->len;
}
@@ -974,6 +975,31 @@ static void fwnet_transmit_packet_done(s
fwnet_free_ptask(ptask);
}
+static void fwnet_transmit_packet_failed(struct fwnet_packet_task *ptask)
+{
+ struct fwnet_device *dev = ptask->dev;
+ unsigned long flags;
+ bool free;
+
+ spin_lock_irqsave(&dev->lock, flags);
+
+ /* One fragment failed; don't try to send remaining fragments. */
+ ptask->outstanding_pkts = 0;
+
+ /* Check whether we or the networking TX soft-IRQ is last user. */
+ free = !list_empty(&ptask->pt_link);
+ if (free)
+ list_del(&ptask->pt_link);
+
+ dev->netdev->stats.tx_dropped++;
+ dev->netdev->stats.tx_errors++;
+
+ spin_unlock_irqrestore(&dev->lock, flags);
+
+ if (free)
+ fwnet_free_ptask(ptask);
+}
+
static void fwnet_write_complete(struct fw_card *card, int rcode,
void *payload, size_t length, void *data)
{
@@ -981,11 +1007,12 @@ static void fwnet_write_complete(struct
ptask = data;
- if (rcode == RCODE_COMPLETE)
+ if (rcode == RCODE_COMPLETE) {
fwnet_transmit_packet_done(ptask);
- else
+ } else {
fw_error("fwnet_write_complete: failed: %x\n", rcode);
- /* ??? error recovery */
+ fwnet_transmit_packet_failed(ptask);
+ }
}
static int fwnet_send_packet(struct fwnet_packet_task *ptask)
--
Stefan Richter
-=====-==-=- =-== --===
http://arcgraph.de/sr/
The current transmit code does not at all make use of
- fwnet_device.packet_list
and only very limited use of
- fwnet_device.broadcasted_list,
- fwnet_device.sent_list.
Their current function is to track whether the TX soft-IRQ finished
dealing with an skb when the AT-req tasklet takes over, and to discard
pending tx datagrams (if there are any) when the local node is removed.
The latter actually contains a race condition bug between the TX soft-IRQ
and the AT-req tasklet.
Instead of these lists and the corresponding link in fwnet_packet_task,
- a flag in fwnet_packet_task to track whether fwnet_tx is done,
- a counter of queued datagrams in fwnet_device
do the job as well.
The above-mentioned theoretical race condition is resolved by letting
fwnet_remove sleep until all datagrams have been flushed. It may sleep
almost arbitrarily long, since fwnet_remove is executed in the context of
a multithreaded (concurrency-managed) workqueue.
The type of max_payload is changed to u16 here to avoid waste in struct
fwnet_packet_task. This value cannot exceed 4096 per IEEE 1394:2008
table 16-18 (or 32768 per the specification of packet headers, if there
is ever going to be something other than beta mode).
Signed-off-by: Stefan Richter <[email protected]>
---
drivers/firewire/net.c | 73 +++++++++++++++--------------------------
1 file changed, 26 insertions(+), 47 deletions(-)
Index: b/drivers/firewire/net.c
===================================================================
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -7,6 +7,7 @@
*/
#include <linux/bug.h>
+#include <linux/delay.h>
#include <linux/device.h>
#include <linux/ethtool.h>
#include <linux/firewire.h>
@@ -170,15 +171,8 @@ struct fwnet_device {
struct fw_address_handler handler;
u64 local_fifo;
- /* List of packets to be sent */
- struct list_head packet_list;
- /*
- * List of packets that were broadcasted. When we get an ISO interrupt
- * one of them has been sent
- */
- struct list_head broadcasted_list;
- /* List of packets that have been sent but not yet acked */
- struct list_head sent_list;
+ /* Number of tx datagrams that have been queued but not yet acked */
+ int queued_datagrams;
struct list_head peer_list;
struct fw_card *card;
@@ -196,7 +190,7 @@ struct fwnet_peer {
unsigned pdg_size; /* pd_list size */
u16 datagram_label; /* outgoing datagram label */
- unsigned max_payload; /* includes RFC2374_FRAG_HDR_SIZE overhead */
+ u16 max_payload; /* includes RFC2374_FRAG_HDR_SIZE overhead */
int node_id;
int generation;
unsigned speed;
@@ -204,22 +198,18 @@ struct fwnet_peer {
/* This is our task struct. It's used for the packet complete callback. */
struct fwnet_packet_task {
- /*
- * ptask can actually be on dev->packet_list, dev->broadcasted_list,
- * or dev->sent_list depending on its current state.
- */
- struct list_head pt_link;
struct fw_transaction transaction;
struct rfc2734_header hdr;
struct sk_buff *skb;
struct fwnet_device *dev;
int outstanding_pkts;
- unsigned max_payload;
u64 fifo_addr;
u16 dest_node;
+ u16 max_payload;
u8 generation;
u8 speed;
+ u8 enqueued;
};
/*
@@ -916,9 +906,9 @@ static void fwnet_transmit_packet_done(s
ptask->outstanding_pkts--;
/* Check whether we or the networking TX soft-IRQ is last user. */
- free = (ptask->outstanding_pkts == 0 && !list_empty(&ptask->pt_link));
+ free = (ptask->outstanding_pkts == 0 && ptask->enqueued);
if (free)
- list_del(&ptask->pt_link);
+ dev->queued_datagrams--;
if (ptask->outstanding_pkts == 0) {
dev->netdev->stats.tx_packets++;
@@ -987,9 +977,9 @@ static void fwnet_transmit_packet_failed
ptask->outstanding_pkts = 0;
/* Check whether we or the networking TX soft-IRQ is last user. */
- free = !list_empty(&ptask->pt_link);
+ free = ptask->enqueued;
if (free)
- list_del(&ptask->pt_link);
+ dev->queued_datagrams--;
dev->netdev->stats.tx_dropped++;
dev->netdev->stats.tx_errors++;
@@ -1070,9 +1060,11 @@ static int fwnet_send_packet(struct fwne
spin_lock_irqsave(&dev->lock, flags);
/* If the AT tasklet already ran, we may be last user. */
- free = (ptask->outstanding_pkts == 0 && list_empty(&ptask->pt_link));
+ free = (ptask->outstanding_pkts == 0 && !ptask->enqueued);
if (!free)
- list_add_tail(&ptask->pt_link, &dev->broadcasted_list);
+ ptask->enqueued = true;
+ else
+ dev->queued_datagrams--;
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1087,9 +1079,11 @@ static int fwnet_send_packet(struct fwne
spin_lock_irqsave(&dev->lock, flags);
/* If the AT tasklet already ran, we may be last user. */
- free = (ptask->outstanding_pkts == 0 && list_empty(&ptask->pt_link));
+ free = (ptask->outstanding_pkts == 0 && !ptask->enqueued);
if (!free)
- list_add_tail(&ptask->pt_link, &dev->sent_list);
+ ptask->enqueued = true;
+ else
+ dev->queued_datagrams--;
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1351,10 +1345,12 @@ static netdev_tx_t fwnet_tx(struct sk_bu
max_payload += RFC2374_FRAG_HDR_SIZE;
}
+ dev->queued_datagrams++;
+
spin_unlock_irqrestore(&dev->lock, flags);
ptask->max_payload = max_payload;
- INIT_LIST_HEAD(&ptask->pt_link);
+ ptask->enqueued = 0;
fwnet_send_packet(ptask);
@@ -1500,14 +1496,9 @@ static int fwnet_probe(struct device *_d
dev->broadcast_rcv_context = NULL;
dev->broadcast_xmt_max_payload = 0;
dev->broadcast_xmt_datagramlabel = 0;
-
dev->local_fifo = FWNET_NO_FIFO_ADDR;
-
- INIT_LIST_HEAD(&dev->packet_list);
- INIT_LIST_HEAD(&dev->broadcasted_list);
- INIT_LIST_HEAD(&dev->sent_list);
+ dev->queued_datagrams = 0;
INIT_LIST_HEAD(&dev->peer_list);
-
dev->card = card;
dev->netdev = net;
@@ -1565,7 +1556,7 @@ static int fwnet_remove(struct device *_
struct fwnet_peer *peer = dev_get_drvdata(_dev);
struct fwnet_device *dev = peer->dev;
struct net_device *net;
- struct fwnet_packet_task *ptask, *pt_next;
+ int i;
mutex_lock(&fwnet_device_mutex);
@@ -1583,21 +1574,9 @@ static int fwnet_remove(struct device *_
dev->card);
fw_iso_context_destroy(dev->broadcast_rcv_context);
}
- list_for_each_entry_safe(ptask, pt_next,
- &dev->packet_list, pt_link) {
- dev_kfree_skb_any(ptask->skb);
- kmem_cache_free(fwnet_packet_task_cache, ptask);
- }
- list_for_each_entry_safe(ptask, pt_next,
- &dev->broadcasted_list, pt_link) {
- dev_kfree_skb_any(ptask->skb);
- kmem_cache_free(fwnet_packet_task_cache, ptask);
- }
- list_for_each_entry_safe(ptask, pt_next,
- &dev->sent_list, pt_link) {
- dev_kfree_skb_any(ptask->skb);
- kmem_cache_free(fwnet_packet_task_cache, ptask);
- }
+ for (i = 0; dev->queued_datagrams && i < 5; i++)
+ ssleep(1);
+ WARN_ON(dev->queued_datagrams);
list_del(&dev->dev_link);
free_netdev(net);
--
Stefan Richter
-=====-==-=- =-== --===
http://arcgraph.de/sr/
This prevents firewire-net from submitting write requests in fast
succession until failure due to all 64 transaction labels being used up for
unfinished split transactions. The netif_stop/wake_queue API is used
for this purpose.
The high watermark is set to considerably less than 64 (I chose 8) in
order to put less pressure onto responder peers. Still, responders which
run Linux firewire-net need to be improved to keep up with even this
small number of outstanding split transactions. (This considers the case
of only one responder at a time, or of a primary responder, because this
is the most likely case with IP over 1394.)
This is based on the count of unfinished TX datagrams. Maybe it should
rather be based on the count of unfinished TX fragments? Many 1394a
CardBus cards accept only 1024-byte packets, i.e. require two fragments
per datagram at the standard MTU of 1500. However, the chosen watermark
is still well below the maximum number of tlabels.
Without this stop/wake mechanism, datagrams were simply lost whenever
the tlabel pool was exhausted.
I wonder what a good net_device.tx_queue_len value is. I just set it
to the same value as the chosen watermark for now. Note that the net
scheduler still lets us exceed the high watermark occasionally. This
could be prevented by an earlier queued_datagrams check in fwnet_tx with
a NETDEV_TX_BUSY return, but such an early check did not actually
improve performance in my testing, i.e. it did not reduce busy-responder
situations.
Results:
- I did not see changes to resulting throughput that were discernible
from the usual measuring noise.
- With this change, other requesters on the same node now have a
chance to get spare transaction labels while firewire-net is
transmitting.
Signed-off-by: Stefan Richter <[email protected]>
---
drivers/firewire/net.c | 36 ++++++++++++++++++++++++++++--------
1 file changed, 28 insertions(+), 8 deletions(-)
Index: b/drivers/firewire/net.c
===================================================================
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -28,8 +28,14 @@
#include <asm/unaligned.h>
#include <net/arp.h>
-#define FWNET_MAX_FRAGMENTS 25 /* arbitrary limit */
-#define FWNET_ISO_PAGE_COUNT (PAGE_SIZE < 16 * 1024 ? 4 : 2)
+/* rx limits */
+#define FWNET_MAX_FRAGMENTS 25 /* arbitrary limit */
+#define FWNET_ISO_PAGE_COUNT (PAGE_SIZE < 16 * 1024 ? 4 : 2)
+
+/* tx limits */
+#define FWNET_TX_QUEUE_LEN 8 /* ? */
+#define FWNET_MAX_QUEUED_DATAGRAMS 8 /* should keep AT DMA busy enough */
+#define FWNET_MIN_QUEUED_DATAGRAMS 2
#define IEEE1394_BROADCAST_CHANNEL 31
#define IEEE1394_ALL_NODES (0xffc0 | 0x003f)
@@ -892,6 +898,20 @@ static void fwnet_free_ptask(struct fwne
kmem_cache_free(fwnet_packet_task_cache, ptask);
}
+/* caller must hold dev->lock */
+static void inc_queued_datagrams(struct fwnet_device *dev)
+{
+ if (++dev->queued_datagrams >= FWNET_MAX_QUEUED_DATAGRAMS)
+ netif_stop_queue(dev->netdev);
+}
+
+/* caller must hold dev->lock */
+static void dec_queued_datagrams(struct fwnet_device *dev)
+{
+ if (--dev->queued_datagrams == FWNET_MIN_QUEUED_DATAGRAMS)
+ netif_wake_queue(dev->netdev);
+}
+
static int fwnet_send_packet(struct fwnet_packet_task *ptask);
static void fwnet_transmit_packet_done(struct fwnet_packet_task *ptask)
@@ -908,7 +928,7 @@ static void fwnet_transmit_packet_done(s
/* Check whether we or the networking TX soft-IRQ is last user. */
free = (ptask->outstanding_pkts == 0 && ptask->enqueued);
if (free)
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
if (ptask->outstanding_pkts == 0) {
dev->netdev->stats.tx_packets++;
@@ -979,7 +999,7 @@ static void fwnet_transmit_packet_failed
/* Check whether we or the networking TX soft-IRQ is last user. */
free = ptask->enqueued;
if (free)
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
dev->netdev->stats.tx_dropped++;
dev->netdev->stats.tx_errors++;
@@ -1064,7 +1084,7 @@ static int fwnet_send_packet(struct fwne
if (!free)
ptask->enqueued = true;
else
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1083,7 +1103,7 @@ static int fwnet_send_packet(struct fwne
if (!free)
ptask->enqueued = true;
else
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1345,7 +1365,7 @@ static netdev_tx_t fwnet_tx(struct sk_bu
max_payload += RFC2374_FRAG_HDR_SIZE;
}
- dev->queued_datagrams++;
+ inc_queued_datagrams(dev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1415,7 +1435,7 @@ static void fwnet_init_dev(struct net_de
net->addr_len = FWNET_ALEN;
net->hard_header_len = FWNET_HLEN;
net->type = ARPHRD_IEEE1394;
- net->tx_queue_len = 10;
+ net->tx_queue_len = FWNET_TX_QUEUE_LEN;
SET_ETHTOOL_OPS(net, &fwnet_ethtool_ops);
}
--
Stefan Richter
-=====-==-=- =-== --===
http://arcgraph.de/sr/
This prevents firewire-net from submitting write requests in fast
succession until failure due to all 64 transaction labels being used up for
unfinished split transactions. The netif_stop/wake_queue API is used
for this purpose.
Without this stop/wake mechanism, datagrams were simply lost whenever
the tlabel pool was exhausted. Plus, tlabel exhaustion by firewire-net
also prevented other, unrelated outbound transactions from being initiated.
The high watermark is set to considerably less than 64 (I chose 8)
because peers which run current Linux firewire-ohci are still easily
saturated by this (i.e. some datagrams are dropped with ack-busy-*
events), depending on the hardware at transmitter and receiver side.
I did not see changes to resulting throughput that were discernible from
the usual measuring noise. To do: Revisit the choice of queue depth
once firewire-ohci's AR DMA has been improved.
I wonder what a good net_device.tx_queue_len value is. I just set it
to the same value as the chosen watermark for now.
Signed-off-by: Stefan Richter <[email protected]>
---
Update: Stricter version with an early NETDEV_TX_BUSY return if the
.ndo_start_xmit method is called while the driver is stopping (or has
stopped) the transmit queue. Thus there can really never be more than
FWNET_MAX_QUEUED_DATAGRAMS pending outbound 1394 transactions.
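To make the encoding of the stopped state in the counter easier to
follow, here is a worked trace with the constants chosen below (this is
only an illustration of the code in this patch, not additional code):

/*
 * Trace of dev->queued_datagrams with FWNET_MAX_QUEUED_DATAGRAMS = 8,
 * FWNET_MIN_QUEUED_DATAGRAMS = 2, FWNET_TX_QUEUE_STOPPED = 8:
 *
 *  7 -> 8    fwnet_tx() hits the high watermark, adds
 *            FWNET_TX_QUEUE_STOPPED (counter becomes 16) and calls
 *            netif_stop_queue()
 *  16 -> 10  completions trickle in via dec_queued_datagrams()
 *  10        == FWNET_MIN_QUEUED_DATAGRAMS + FWNET_TX_QUEUE_STOPPED,
 *            so dec_queued_datagrams() subtracts FWNET_TX_QUEUE_STOPPED
 *            (counter becomes 2) and calls netif_wake_queue()
 *  > 8       any value above FWNET_MAX_QUEUED_DATAGRAMS means "queue
 *            stopped", which is what the early NETDEV_TX_BUSY check in
 *            fwnet_tx() tests for
 */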
drivers/firewire/net.c | 53 ++++++++++++++++++++++++++++++-----------
1 file changed, 39 insertions(+), 14 deletions(-)
Index: b/drivers/firewire/net.c
===================================================================
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -28,8 +28,15 @@
#include <asm/unaligned.h>
#include <net/arp.h>
-#define FWNET_MAX_FRAGMENTS 25 /* arbitrary limit */
-#define FWNET_ISO_PAGE_COUNT (PAGE_SIZE < 16 * 1024 ? 4 : 2)
+/* rx limits */
+#define FWNET_MAX_FRAGMENTS 25 /* arbitrary limit */
+#define FWNET_ISO_PAGE_COUNT (PAGE_SIZE < 16*1024 ? 4 : 2)
+
+/* tx limits */
+#define FWNET_MAX_QUEUED_DATAGRAMS 8 /* should keep AT DMA busy enough */
+#define FWNET_MIN_QUEUED_DATAGRAMS 2
+#define FWNET_TX_QUEUE_STOPPED FWNET_MAX_QUEUED_DATAGRAMS
+#define FWNET_TX_QUEUE_LEN FWNET_MAX_QUEUED_DATAGRAMS /* ? */
#define IEEE1394_BROADCAST_CHANNEL 31
#define IEEE1394_ALL_NODES (0xffc0 | 0x003f)
@@ -892,6 +899,16 @@ static void fwnet_free_ptask(struct fwne
kmem_cache_free(fwnet_packet_task_cache, ptask);
}
+/* Caller must hold dev->lock. */
+static void dec_queued_datagrams(struct fwnet_device *dev)
+{
+ if (--dev->queued_datagrams ==
+ FWNET_MIN_QUEUED_DATAGRAMS + FWNET_TX_QUEUE_STOPPED) {
+ dev->queued_datagrams -= FWNET_TX_QUEUE_STOPPED;
+ netif_wake_queue(dev->netdev);
+ }
+}
+
static int fwnet_send_packet(struct fwnet_packet_task *ptask);
static void fwnet_transmit_packet_done(struct fwnet_packet_task *ptask)
@@ -908,7 +925,7 @@ static void fwnet_transmit_packet_done(s
/* Check whether we or the networking TX soft-IRQ is last user. */
free = (ptask->outstanding_pkts == 0 && ptask->enqueued);
if (free)
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
if (ptask->outstanding_pkts == 0) {
dev->netdev->stats.tx_packets++;
@@ -979,7 +996,7 @@ static void fwnet_transmit_packet_failed
/* Check whether we or the networking TX soft-IRQ is last user. */
free = ptask->enqueued;
if (free)
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
dev->netdev->stats.tx_dropped++;
dev->netdev->stats.tx_errors++;
@@ -1064,7 +1081,7 @@ static int fwnet_send_packet(struct fwne
if (!free)
ptask->enqueued = true;
else
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1083,7 +1100,7 @@ static int fwnet_send_packet(struct fwne
if (!free)
ptask->enqueued = true;
else
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1249,6 +1266,14 @@ static netdev_tx_t fwnet_tx(struct sk_bu
struct fwnet_peer *peer;
unsigned long flags;
+ spin_lock_irqsave(&dev->lock, flags);
+
+ if (dev->queued_datagrams > FWNET_MAX_QUEUED_DATAGRAMS) {
+ spin_unlock_irqrestore(&dev->lock, flags);
+
+ return NETDEV_TX_BUSY;
+ }
+
ptask = kmem_cache_alloc(fwnet_packet_task_cache, GFP_ATOMIC);
if (ptask == NULL)
goto fail;
@@ -1267,9 +1292,6 @@ static netdev_tx_t fwnet_tx(struct sk_bu
proto = hdr_buf.h_proto;
dg_size = skb->len;
- /* serialize access to peer, including peer->datagram_label */
- spin_lock_irqsave(&dev->lock, flags);
-
/*
* Set the transmission type for the packet. ARP packets and IP
* broadcast packets are sent via GASP.
@@ -1291,7 +1313,7 @@ static netdev_tx_t fwnet_tx(struct sk_bu
peer = fwnet_peer_find_by_guid(dev, be64_to_cpu(guid));
if (!peer || peer->fifo == FWNET_NO_FIFO_ADDR)
- goto fail_unlock;
+ goto fail;
generation = peer->generation;
dest_node = peer->node_id;
@@ -1345,7 +1367,10 @@ static netdev_tx_t fwnet_tx(struct sk_bu
max_payload += RFC2374_FRAG_HDR_SIZE;
}
- dev->queued_datagrams++;
+ if (++dev->queued_datagrams == FWNET_MAX_QUEUED_DATAGRAMS) {
+ dev->queued_datagrams += FWNET_TX_QUEUE_STOPPED;
+ netif_stop_queue(dev->netdev);
+ }
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1356,9 +1381,9 @@ static netdev_tx_t fwnet_tx(struct sk_bu
return NETDEV_TX_OK;
- fail_unlock:
- spin_unlock_irqrestore(&dev->lock, flags);
fail:
+ spin_unlock_irqrestore(&dev->lock, flags);
+
if (ptask)
kmem_cache_free(fwnet_packet_task_cache, ptask);
@@ -1415,7 +1440,7 @@ static void fwnet_init_dev(struct net_de
net->addr_len = FWNET_ALEN;
net->hard_header_len = FWNET_HLEN;
net->type = ARPHRD_IEEE1394;
- net->tx_queue_len = 10;
+ net->tx_queue_len = FWNET_TX_QUEUE_LEN;
SET_ETHTOOL_OPS(net, &fwnet_ethtool_ops);
}
--
Stefan Richter
-=====-==-=- =-== -==-=
http://arcgraph.de/sr/
This prevents firewire-net from submitting write requests in fast
succession until failure due to all 64 transaction labels being used up
for unfinished split transactions. The netif_stop/wake_queue API is
used for this purpose.
Without this stop/wake mechanism, datagrams were simply lost whenever
the tlabel pool was exhausted. Plus, tlabel exhaustion by firewire-net
also prevented other, unrelated outbound transactions from being initiated.
The high watermark is set to considerably less than 64 (I chose 8)
because peers which run current Linux firewire-ohci are still easily
saturated by this (i.e. some datagrams are dropped with ack-busy-*
events), depending on the hardware at transmitter and receiver side.
I did not see changes to resulting throughput that were discernible from
the usual measuring noise. To do: Revisit the choice of queue depth
once firewire-ohci's AR DMA has been improved.
I wonder what a good net_device.tx_queue_len value is. I just set it
to the same value as the chosen watermark for now.
Note: This removes some netif_wake_queue calls from reception code paths.
They were apparently copy&paste artefacts from a nonsensical
netif_wake_queue use in the older eth1394 driver. This belongs only
in the transmit path.
Signed-off-by: Stefan Richter <[email protected]>
---
Update 2: Maxim told me to de-obfuscate status tracking. I realized
that netif_queue_stopped can be used for that and thereby noticed bogus
usages of it in the rx path.
drivers/firewire/net.c | 59 ++++++++++++++++++++++++-----------------
1 file changed, 35 insertions(+), 24 deletions(-)
Index: b/drivers/firewire/net.c
===================================================================
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -28,8 +28,14 @@
#include <asm/unaligned.h>
#include <net/arp.h>
-#define FWNET_MAX_FRAGMENTS 25 /* arbitrary limit */
-#define FWNET_ISO_PAGE_COUNT (PAGE_SIZE < 16 * 1024 ? 4 : 2)
+/* rx limits */
+#define FWNET_MAX_FRAGMENTS 25 /* arbitrary limit */
+#define FWNET_ISO_PAGE_COUNT (PAGE_SIZE < 16*1024 ? 4 : 2)
+
+/* tx limits */
+#define FWNET_MAX_QUEUED_DATAGRAMS 8 /* should keep AT DMA busy enough */
+#define FWNET_MIN_QUEUED_DATAGRAMS 2
+#define FWNET_TX_QUEUE_LEN FWNET_MAX_QUEUED_DATAGRAMS /* ? */
#define IEEE1394_BROADCAST_CHANNEL 31
#define IEEE1394_ALL_NODES (0xffc0 | 0x003f)
@@ -641,8 +647,6 @@ static int fwnet_finish_incoming_packet(
net->stats.rx_packets++;
net->stats.rx_bytes += skb->len;
}
- if (netif_queue_stopped(net))
- netif_wake_queue(net);
return 0;
@@ -651,8 +655,6 @@ static int fwnet_finish_incoming_packet(
net->stats.rx_dropped++;
dev_kfree_skb_any(skb);
- if (netif_queue_stopped(net))
- netif_wake_queue(net);
return -ENOENT;
}
@@ -784,15 +786,10 @@ static int fwnet_incoming_packet(struct
* Datagram is not complete, we're done for the
* moment.
*/
- spin_unlock_irqrestore(&dev->lock, flags);
-
- return 0;
+ retval = 0;
fail:
spin_unlock_irqrestore(&dev->lock, flags);
- if (netif_queue_stopped(net))
- netif_wake_queue(net);
-
return retval;
}
@@ -892,6 +889,13 @@ static void fwnet_free_ptask(struct fwne
kmem_cache_free(fwnet_packet_task_cache, ptask);
}
+/* Caller must hold dev->lock. */
+static void dec_queued_datagrams(struct fwnet_device *dev)
+{
+ if (--dev->queued_datagrams == FWNET_MIN_QUEUED_DATAGRAMS)
+ netif_wake_queue(dev->netdev);
+}
+
static int fwnet_send_packet(struct fwnet_packet_task *ptask);
static void fwnet_transmit_packet_done(struct fwnet_packet_task *ptask)
@@ -908,7 +912,7 @@ static void fwnet_transmit_packet_done(s
/* Check whether we or the networking TX soft-IRQ is last user. */
free = (ptask->outstanding_pkts == 0 && ptask->enqueued);
if (free)
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
if (ptask->outstanding_pkts == 0) {
dev->netdev->stats.tx_packets++;
@@ -979,7 +983,7 @@ static void fwnet_transmit_packet_failed
/* Check whether we or the networking TX soft-IRQ is last user. */
free = ptask->enqueued;
if (free)
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
dev->netdev->stats.tx_dropped++;
dev->netdev->stats.tx_errors++;
@@ -1064,7 +1068,7 @@ static int fwnet_send_packet(struct fwne
if (!free)
ptask->enqueued = true;
else
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1083,7 +1087,7 @@ static int fwnet_send_packet(struct fwne
if (!free)
ptask->enqueued = true;
else
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1249,6 +1253,15 @@ static netdev_tx_t fwnet_tx(struct sk_bu
struct fwnet_peer *peer;
unsigned long flags;
+ spin_lock_irqsave(&dev->lock, flags);
+
+ /* Can this happen? */
+ if (netif_queue_stopped(dev->netdev)) {
+ spin_unlock_irqrestore(&dev->lock, flags);
+
+ return NETDEV_TX_BUSY;
+ }
+
ptask = kmem_cache_alloc(fwnet_packet_task_cache, GFP_ATOMIC);
if (ptask == NULL)
goto fail;
@@ -1267,9 +1280,6 @@ static netdev_tx_t fwnet_tx(struct sk_bu
proto = hdr_buf.h_proto;
dg_size = skb->len;
- /* serialize access to peer, including peer->datagram_label */
- spin_lock_irqsave(&dev->lock, flags);
-
/*
* Set the transmission type for the packet. ARP packets and IP
* broadcast packets are sent via GASP.
@@ -1291,7 +1301,7 @@ static netdev_tx_t fwnet_tx(struct sk_bu
peer = fwnet_peer_find_by_guid(dev, be64_to_cpu(guid));
if (!peer || peer->fifo == FWNET_NO_FIFO_ADDR)
- goto fail_unlock;
+ goto fail;
generation = peer->generation;
dest_node = peer->node_id;
@@ -1345,7 +1355,8 @@ static netdev_tx_t fwnet_tx(struct sk_bu
max_payload += RFC2374_FRAG_HDR_SIZE;
}
- dev->queued_datagrams++;
+ if (++dev->queued_datagrams == FWNET_MAX_QUEUED_DATAGRAMS)
+ netif_stop_queue(dev->netdev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1356,9 +1367,9 @@ static netdev_tx_t fwnet_tx(struct sk_bu
return NETDEV_TX_OK;
- fail_unlock:
- spin_unlock_irqrestore(&dev->lock, flags);
fail:
+ spin_unlock_irqrestore(&dev->lock, flags);
+
if (ptask)
kmem_cache_free(fwnet_packet_task_cache, ptask);
@@ -1415,7 +1426,7 @@ static void fwnet_init_dev(struct net_de
net->addr_len = FWNET_ALEN;
net->hard_header_len = FWNET_HLEN;
net->type = ARPHRD_IEEE1394;
- net->tx_queue_len = 10;
+ net->tx_queue_len = FWNET_TX_QUEUE_LEN;
SET_ETHTOOL_OPS(net, &fwnet_ethtool_ops);
}
--
Stefan Richter
-=====-==-=- =-== -==-=
http://arcgraph.de/sr/
On Sat, 2010-11-13 at 23:07 +0100, Stefan Richter wrote:
> This prevents firewire-net from submitting write requests in fast
> succession until failure due to all 64 transaction labels being used up
> for unfinished split transactions. The netif_stop/wake_queue API is
> used for this purpose.
>
> Without this stop/wake mechanism, datagrams were simply lost whenever
> the tlabel pool was exhausted. Plus, tlabel exhaustion by firewire-net
> also prevented other, unrelated outbound transactions from being initiated.
>
> The high watermark is set to considerably less than 64 (I chose 8)
> because peers which run current Linux firewire-ohci are still easily
> saturated by this (i.e. some datagrams are dropped with ack-busy-*
> events), depending on the hardware at transmitter and receiver side.
>
> I did not see changes to resulting throughput that were discernible from
> the usual measuring noise. To do: Revisit the choice of queue depth
> once firewire-ohci's AR DMA has been improved.
>
> I wonder what a good net_device.tx_queue_len value is. I just set it
> to the same value as the chosen watermark for now.
>
> Note: This removes some netif_wake_queue from reception code paths.
> They were apparently copy&paste artefacts from a nonsensical
> netif_wake_queue use in the older eth1394 driver. This belongs only
> into the transmit path.
>
> Signed-off-by: Stefan Richter <[email protected]>
> ---
> Update 2: Maxim told me to de-obfuscate status tracking. I realized
> that netif_queue_stopped can be used for that and thereby noticed bogus
> usages of it in the rx path.
In fact, after a lot of testing I see that the original patch,
'[PATCH 4/4] firewire: net: throttle TX queue before running out of
tlabels', works the best here.
With the AR fixes, I don't see even a single fwnet_write_complete error on
either side.
However, the 'update 2' (maybe update 1 too, didn't test) lowers
desktop->laptop throughput somewhat
(250 vs. 227 Mbits/s). I tested this many times.
The actual raw throughput possible with a UDP stream and either no
throttling or a higher packets-in-flight count (I tested 50/30) is
280 Mbits/s.
BTW, I still don't fully understand why my laptop sends only at 180
Mbits/s pretty much always, regardless of patches or TCP/UDP.
I also tested the performance impact of the other patches, and it is too
small to see through the noise.
Not bad, eh? From a complete trainwreck, IP over 1394 has turned into a
very stable and fast connection that beats 100 Mbit ethernet by a bit.
Now, next on my list are the POC (piece of cake) items.
I need to figure out why s2ram hoses the network connection.
In fact, firewire-ohci usually does work, and a reload of firewire-net
restores the connection.
Also, I need to add all required bits to make firewire-net work with NM.
I need to make resets more robust. Currently, after a cable plug it takes
some time until the connection starts working.
So thanks again, especially to Clemens Ladisch for the hardest fixes
that made all this possible.
And of course feel free not to merge my AR rewrite; it is mostly done
as a proof of concept to see if my hardware is buggy or not.
I am sure these patches can be done better.
Best regards,
Maxim Levitsky
Maxim Levitsky wrote:
> In fact after lot of testing I see that original patch,
> '[PATCH 4/4] firewire: net: throttle TX queue before running out of
> tlabels' works the best here.
> With AR fixes, I don't see even a single fwnet_write_complete error on
> either side.
Well, that version missed that the rx path opened up the tx queue again. I.e.
it did not work as intended.
> However the 'update 2' (maybe update 1 too, didn't test), lowers
> desktop->laptop throughput somewhat.
> (250 vs 227 Mbits/s). I tested this many times.
>
> The actual raw throughput possible with a UDP stream and either no
> throttling or a higher packets-in-flight count (I tested 50/30) is
> 280 Mbits/s.
Good, I will test deeper queues with a few different controllers here. As
long as we keep a margin to 64 so that other traffic besides IPover1394 still
has a chance to acquire transaction labels, it's OK.
> BTW, I still don't understand fully why my laptop sends only at 180
> Mbits/s pretty much always regardless of patches or TCP/UDP.
If it is not CPU bound, then it is because Ricoh did not optimize the AR DMA
unit as well as Texas Instruments did.
--
Stefan Richter
-=====-==-=- =-== -===-
http://arcgraph.de/sr/
On Sun, 2010-11-14 at 10:25 +0100, Stefan Richter wrote:
> Maxim Levitsky wrote:
> > In fact after lot of testing I see that original patch,
> > '[PATCH 4/4] firewire: net: throttle TX queue before running out of
> > tlabels' works the best here.
> > With AR fixes, I don't see even a single fwnet_write_complete error on
> > either side.
>
> Well, that version missed that the rx path opened up the tx queue again. I.e.
> it did not work as intended.
>
> > However the 'update 2' (maybe update 1 too, didn't test), lowers
> > desktop->laptop throughput somewhat.
> > (250 vs 227 Mbits/s). I tested this many times.
> >
> > The actual raw throughput possible with a UDP stream and either no
> > throttling or a higher packets-in-flight count (I tested 50/30) is
> > 280 Mbits/s.
>
> Good, I will test deeper queues with a few different controllers here. As
> long as we keep a margin to 64 so that other traffic besides IPover1394 still
> has a chance to acquire transaction labels, it's OK.
Just tested the 'update 2' with an 8-16 margin. Gives me ~250 Mbits/s TCP
easily, and ~280 Mbit/s UDP. Pretty much the maximum it's possible to get
out of this hardware.
>
> > BTW, I still don't understand fully why my laptop sends only at 180
> > Mbits/s pretty much always regardless of patches or TCP/UDP.
>
> If it is not CPU bound, then it is because Ricoh did not optimize the AR DMA
> unit as well as Texas Instruments did.
You mean AT, because in the fast case (desktop->laptop) the TI
transmits and the Ricoh receives. In the slow case the Ricoh transmits
and the TI receives.
Anyway, the speeds of the new stack beat the old one by a significant margin.
Best regards,
Maxim Levitsky
On 14 Nov, Maxim Levitsky wrote:
> On Sun, 2010-11-14 at 10:25 +0100, Stefan Richter wrote:
>> Maxim Levitsky wrote:
>> > However the 'update 2' (maybe update 1 too, didn't test), lowers
>> > desktop->laptop throughput somewhat.
>> > (250 vs 227 Mbits/s). I tested this many times.
>> >
>> > The actual raw throughput possible with a UDP stream and either no
>> > throttling or a higher packets-in-flight count (I tested 50/30) is
>> > 280 Mbits/s.
>>
>> Good, I will test deeper queues with a few different controllers here. As
>> long as we keep a margin to 64 so that other traffic besides IPover1394 still
>> has a chance to acquire transaction labels, it's OK.
> Just tested the 'update 2' with 8-16 margin. Gives me ~250 Mbits/s TCP
> easily, and ~280 Mbit/s UDP. Pretty much the maximum it's possible to get
> out of this hardware.
Good, update below. Also tested with an OS X peer on my side to exclude
a throughput regression.
>> > BTW, I still don't understand fully why my laptop sends only at 180
>> > Mbits/s pretty much always regardless of patches or TCP/UDP.
>>
>> If it is not CPU bound, then it is because Ricoh did not optimize the AR DMA
>> unit as well as Texas Instruments did.
> You mean AT, because in the fast case (desktop->laptop) the TI
> transmits and the Ricoh receives. In the slow case the Ricoh transmits
> and the TI receives.
Yes, I meant to write 'AT'.
> Anyway speeds of new stack beat the old one by significant margin.
Gap count optimization surely plays a big role in this.
---- 8< ----
[PATCH update 3] firewire: net: throttle TX queue before running out of tlabels
This prevents firewire-net from submitting write requests in fast
succession until failure due to all 64 transaction labels being used up
for unfinished split transactions. The netif_stop/wake_queue API is
used for this purpose.
Without this stop/wake mechanism, datagrams were simply lost whenever
the tlabel pool was exhausted. Plus, tlabel exhaustion by firewire-net
also prevented other, unrelated outbound transactions from being initiated.
The chosen queue depth was checked by me to hit the maximum possible
throughput with an OS X peer whose receive DMA is good enough to never
reject requests due to a busy inbound request FIFO. Current Linux peers
show a mixed picture of -5%...+15% change in bandwidth; their current
bottleneck is RCODE_BUSY situations (fewer or more, depending on TX
queue depth) due to the too small AR buffer in firewire-ohci.
Maxim Levitsky tested this change with similar watermarks with a Linux
peer and some pending firewire-ohci improvements that address the
RCODE_BUSY problem and confirmed that these TX queue limits are good.
Note: This removes some netif_wake_queue calls from reception code paths.
They were apparently copy&paste artefacts from a nonsensical
netif_wake_queue use in the older eth1394 driver. This belongs only
in the transmit path.
Signed-off-by: Stefan Richter <[email protected]>
Tested-by: Maxim Levitsky <[email protected]>
---
drivers/firewire/net.c | 59 ++++++++++++++++++++++++-----------------
1 file changed, 35 insertions(+), 24 deletions(-)
Index: b/drivers/firewire/net.c
===================================================================
--- a/drivers/firewire/net.c
+++ b/drivers/firewire/net.c
@@ -28,8 +28,14 @@
#include <asm/unaligned.h>
#include <net/arp.h>
-#define FWNET_MAX_FRAGMENTS 25 /* arbitrary limit */
-#define FWNET_ISO_PAGE_COUNT (PAGE_SIZE < 16 * 1024 ? 4 : 2)
+/* rx limits */
+#define FWNET_MAX_FRAGMENTS 30 /* arbitrary, > TX queue depth */
+#define FWNET_ISO_PAGE_COUNT (PAGE_SIZE < 16*1024 ? 4 : 2)
+
+/* tx limits */
+#define FWNET_MAX_QUEUED_DATAGRAMS 20 /* < 64 = number of tlabels */
+#define FWNET_MIN_QUEUED_DATAGRAMS 10 /* should keep AT DMA busy enough */
+#define FWNET_TX_QUEUE_LEN FWNET_MAX_QUEUED_DATAGRAMS /* ? */
#define IEEE1394_BROADCAST_CHANNEL 31
#define IEEE1394_ALL_NODES (0xffc0 | 0x003f)
@@ -641,8 +647,6 @@ static int fwnet_finish_incoming_packet(
net->stats.rx_packets++;
net->stats.rx_bytes += skb->len;
}
- if (netif_queue_stopped(net))
- netif_wake_queue(net);
return 0;
@@ -651,8 +655,6 @@ static int fwnet_finish_incoming_packet(
net->stats.rx_dropped++;
dev_kfree_skb_any(skb);
- if (netif_queue_stopped(net))
- netif_wake_queue(net);
return -ENOENT;
}
@@ -784,15 +786,10 @@ static int fwnet_incoming_packet(struct
* Datagram is not complete, we're done for the
* moment.
*/
- spin_unlock_irqrestore(&dev->lock, flags);
-
- return 0;
+ retval = 0;
fail:
spin_unlock_irqrestore(&dev->lock, flags);
- if (netif_queue_stopped(net))
- netif_wake_queue(net);
-
return retval;
}
@@ -892,6 +889,13 @@ static void fwnet_free_ptask(struct fwne
kmem_cache_free(fwnet_packet_task_cache, ptask);
}
+/* Caller must hold dev->lock. */
+static void dec_queued_datagrams(struct fwnet_device *dev)
+{
+ if (--dev->queued_datagrams == FWNET_MIN_QUEUED_DATAGRAMS)
+ netif_wake_queue(dev->netdev);
+}
+
static int fwnet_send_packet(struct fwnet_packet_task *ptask);
static void fwnet_transmit_packet_done(struct fwnet_packet_task *ptask)
@@ -908,7 +912,7 @@ static void fwnet_transmit_packet_done(s
/* Check whether we or the networking TX soft-IRQ is last user. */
free = (ptask->outstanding_pkts == 0 && ptask->enqueued);
if (free)
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
if (ptask->outstanding_pkts == 0) {
dev->netdev->stats.tx_packets++;
@@ -979,7 +983,7 @@ static void fwnet_transmit_packet_failed
/* Check whether we or the networking TX soft-IRQ is last user. */
free = ptask->enqueued;
if (free)
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
dev->netdev->stats.tx_dropped++;
dev->netdev->stats.tx_errors++;
@@ -1064,7 +1068,7 @@ static int fwnet_send_packet(struct fwne
if (!free)
ptask->enqueued = true;
else
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1083,7 +1087,7 @@ static int fwnet_send_packet(struct fwne
if (!free)
ptask->enqueued = true;
else
- dev->queued_datagrams--;
+ dec_queued_datagrams(dev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1249,6 +1253,15 @@ static netdev_tx_t fwnet_tx(struct sk_bu
struct fwnet_peer *peer;
unsigned long flags;
+ spin_lock_irqsave(&dev->lock, flags);
+
+ /* Can this happen? */
+ if (netif_queue_stopped(dev->netdev)) {
+ spin_unlock_irqrestore(&dev->lock, flags);
+
+ return NETDEV_TX_BUSY;
+ }
+
ptask = kmem_cache_alloc(fwnet_packet_task_cache, GFP_ATOMIC);
if (ptask == NULL)
goto fail;
@@ -1267,9 +1280,6 @@ static netdev_tx_t fwnet_tx(struct sk_bu
proto = hdr_buf.h_proto;
dg_size = skb->len;
- /* serialize access to peer, including peer->datagram_label */
- spin_lock_irqsave(&dev->lock, flags);
-
/*
* Set the transmission type for the packet. ARP packets and IP
* broadcast packets are sent via GASP.
@@ -1291,7 +1301,7 @@ static netdev_tx_t fwnet_tx(struct sk_bu
peer = fwnet_peer_find_by_guid(dev, be64_to_cpu(guid));
if (!peer || peer->fifo == FWNET_NO_FIFO_ADDR)
- goto fail_unlock;
+ goto fail;
generation = peer->generation;
dest_node = peer->node_id;
@@ -1345,7 +1355,8 @@ static netdev_tx_t fwnet_tx(struct sk_bu
max_payload += RFC2374_FRAG_HDR_SIZE;
}
- dev->queued_datagrams++;
+ if (++dev->queued_datagrams == FWNET_MAX_QUEUED_DATAGRAMS)
+ netif_stop_queue(dev->netdev);
spin_unlock_irqrestore(&dev->lock, flags);
@@ -1356,9 +1367,9 @@ static netdev_tx_t fwnet_tx(struct sk_bu
return NETDEV_TX_OK;
- fail_unlock:
- spin_unlock_irqrestore(&dev->lock, flags);
fail:
+ spin_unlock_irqrestore(&dev->lock, flags);
+
if (ptask)
kmem_cache_free(fwnet_packet_task_cache, ptask);
@@ -1415,7 +1426,7 @@ static void fwnet_init_dev(struct net_de
net->addr_len = FWNET_ALEN;
net->hard_header_len = FWNET_HLEN;
net->type = ARPHRD_IEEE1394;
- net->tx_queue_len = 10;
+ net->tx_queue_len = FWNET_TX_QUEUE_LEN;
SET_ETHTOOL_OPS(net, &fwnet_ethtool_ops);
}
--
Stefan Richter
-=====-==-=- =-== -===-
http://arcgraph.de/sr/
I have made unexpected progress on the remaining issues in firewire-net in
regard to the loss of connection after an s2ram cycle, and the annoying
fact that after a cable replug (intentional of course), it takes time for
the connection to reestablish. These are separate issues, and I know the
exact cause of both (and as a side effect I now know exactly what iso
transactions are and how they work).
Problem #1: large delay after cable removal/insert cycle.
The reason is that IP over 1394 abuses ARP packets so that they carry
additional vital information describing the node (namely the bus address
that is used as the target of block writes, or as they call it, the fifo
address). ARP packets also carry less vital pieces of information, namely
the maximum transfer size (max_rec) and the maximum supported speed of
the sender node.
The problem here is that a bus reset makes these pieces of information
invalid, and more than that, the target node and its fw_peer information
disappear and reappear, but without the above fields set.
The network core is of course unaware of such ugly abuse, and thus it
doesn't send an ARP packet to the destination. In fact it won't even
send one if the destination node is explicitly addressed, because it
appears in the ARP cache.
The solution here is to somehow tell the network core to invalidate the
ARP entry for the target node as soon as it disappears.
I don't yet know how to do that.
Actually, to demonstrate this problem it's enough to execute 'arping',
and it will instantly make the connection work.
And lastly, of course, the connection eventually establishes itself
because the kernel periodically sends ARP requests to validate the
destination network node.
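(If the driver knew the peer's IP address, something along the lines of
the sketch below might do the invalidation via the generic neighbour API.
This is an untested illustration only; fwnet_invalidate_peer_arp and the
peer_ip parameter are made up here, and the real difficulty is that
firewire-net has no mapping from the disappearing fw_peer to an IP
address in the first place.)

#include <linux/netdevice.h>
#include <net/arp.h>
#include <net/neighbour.h>

/* Untested sketch: mark the peer's ARP entry as failed so that the
 * network core sends a fresh ARP request immediately. */
static void fwnet_invalidate_peer_arp(struct net_device *net, __be32 peer_ip)
{
	struct neighbour *n = neigh_lookup(&arp_tbl, &peer_ip, net);

	if (n) {
		neigh_update(n, NULL, NUD_FAILED, NEIGH_UPDATE_F_ADMIN);
		neigh_release(n);
	}
}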
Problem #2:
As was described in problem #1, it's obvious that after suspend to RAM
we need an ARP reply to reestablish the connection.
The problem is that it is received via the iso channel, and that channel
isn't reinitialized after s2ram.
A quick and dirty hack to stop/start the ISO channel from fwnet_update
in firewire-net 'fixes' that problem.
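Roughly, the hack looks like the sketch below (inside net.c, which
already has the needed includes; the helper name
fwnet_restart_broadcast_rx is made up, error handling is omitted, and
whether a plain stop/start is enough without requeuing the receive
buffers is exactly the kind of question that makes it a hack):

/* Untested sketch: kick the broadcast receive context after resume so
 * incoming GASP/ARP traffic is seen again. */
static void fwnet_restart_broadcast_rx(struct fwnet_device *dev)
{
	if (!dev->broadcast_rcv_context)
		return;

	fw_iso_context_stop(dev->broadcast_rcv_context);
	fw_iso_context_start(dev->broadcast_rcv_context, -1, 0,
			     FW_ISO_CONTEXT_MATCH_ALL_TAGS);
}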
A better solution seemed to be to make firewire-ohci reinit all ISO
channels after an s2ram cycle. But this is actually wrong.
That is because the 1394 spec specifies that, first of all, the ISO
channel must be allocated from the IRM node. The firewire stack currently
just uses hardcoded numbers in the two places where ISO is used
(firewire-net and firedtv).
However, it has all the functions implemented for this.
Secondly, that allocation must be redone on each bus reset.
Even more than that, since the 1394 spec doesn't define a way to assign a
channel to a specific client, that must be done in a protocol-specific
way.
This means that on each bus reset all drivers that use ISO channels must
allocate them again and inform the underlying hardware they serve.
Therefore the first solution is actually the correct one.
In the case of firewire-net, it is simpler, because it uses the broadcast
channel, so it only has to find out who the IRM is and read its
BROADCAST_CHANNEL.
However, I think I need to write a function to query the IRM for its
broadcast channel; I don't think the stack has one.
Speaking of IRM discovery, the spec says it should be the node with the
contender bit set and the largest node id. However, the code in core-topology.c,
build_tree seems to take the node that sent the selfID packet last.
Best regards,
Maxim Levitsky
On Nov 15 Maxim Levitsky wrote:
> That is because 1394 spec specifies that first of all the ISO channel
> must be allocated from the IRM node. The firewire stack currently just
> uses hardcoded numbers in two places the ISO is used
> (firewire-net, and firedtv)
> However it has all functions implemented for this.
This is a bug (missing feature) in firedtv but not in firewire-net. The
broadcast channel is allocated and reallocated by the bus manager, not
by an IP-over-1394 user. But you found that out already, below.
Channel allocation and DMA context allocation and control are unrelated
issues, on the other hand. One is a higher-level cooperative protocol
for bus-wide resource management (in which the nodes' roles are defined
in various different ways by protocols such as AV/C's CMP or by IIDC).
The other is about hardware control locally in the OHCI bus bridge
hardware.
[...]
> In case of firewire-net, it is simpler, because it uses the broadcast
> channel, so it only has to find who is the IRM and read its
> BROADCAST_CHANNEL.
>
> However, I think I need to write a function to query the IRM its
> broadcast channel, don't think it has one.
Overkill. Just leave it as is:
1.) We know which number the broadcast channel has.
2.) We optimistically assume that a 1394a compliant IRM or bus
manager exists and allocated that channel.
Why introduce entirely unnecessary fragility?
> Speaking of IRM discovery, the spec says it should be a node with
> contender bit and largest node id. However, the code in
> core-topology.c, build_tree seems to take the node that sent the
> selfID packet last.
This is because of a monotonicity rule for how self ID packets arrive
during the self identification phase.
--
Stefan Richter
-=====-==-=- =-== -====
http://arcgraph.de/sr/
On Mon, 2010-11-15 at 09:01 +0100, Stefan Richter wrote:
> On Nov 15 Maxim Levitsky wrote:
> > That is because 1394 spec specifies that first of all the ISO channel
> > must be allocated from the IRM node. The firewire stack currently just
> > uses hardcoded numbers in two places the ISO is used
> > (firewire-net, and firedtv)
> > However it has all functions implemented for this.
>
> This is a bug (missing feature) in firedtv but not in firewire-net. The
> broadcast channel is allocated and reallocated by the bus manager, not
> by an IP-over-1394 user. But you found that out already, below.
Agree fully.
>
> Channel allocation and DMA context allocation and control are unrelated
> issues, on the other hand. One is a higher-level cooperative protocol
> for bus-wide resource management (in which the nodes' roles are defined
> in various different ways by protocols such as AV/C's CMP or by IIDC).
> The other is about hardware control locally in the OHCI bus bridge
> hardware.
>
> [...]
> > In case of firewire-net, it is simpler, because it uses the broadcast
> > channel, so it only has to find who is the IRM and read its
> > BROADCAST_CHANNEL.
> >
> > However, I think I need to write a function to query the IRM its
> > broadcast channel, don't think it has one.
>
> Overkill. Just leave it as is:
> 1.) We know which number the broadcast channel has.
> 2.) We optimistically assume that a 1394a compliant IRM or bus
> manager exists and allocated that channel.
>
> Why introduce entirely unnecessary fragility?
Looking at the spec again, indeed 31 is the default broadcast channel, and
probably nobody ever changes it by writing BROADCAST_CHANNEL.
The BROADCAST_CHANNEL register was actually added in 1394a.
>
> > Speaking of IRM discovery, the spec says it should be a node with
> > contender bit and largest node id. However, the code in
> > core-topology.c, build_tree seems to take the node that sent the
> > selfID packet last.
>
> This is because of a monotonicity rule for how self ID packets arrive
> during the self identification phase.
Ah, OK, I found this bit in the spec too.
Btw, according to the spec, FireWire appears to be half-duplex.
Also, one note I wanted to make in the previous mail but forgot.
The IP over 1394 spec says that unicast can also use ISO transactions.
However, it doesn't say how to negotiate the ISO channel number, so a
logical conclusion is that it's not possible to use ISO transactions for
unicast.
Is that right?
Best regards,
Maxim Levitsky
On Nov 16 Maxim Levitsky wrote:
> The IP over 1394 tells that unicast can also use ISO transactions.
> However, it doesn't say how to negotiate the ISO channel number, thus a
> logical conclusion is that its not possible to use ISO transactions for
> unicast.
> Is that right?
To my understanding, any unicast transport method other than block writes
to the unicast FIFO is outside the scope of RFC 2734. At the end of
section 7:
"""
Unicast IP datagrams whose quality of service is best-effort SHALL be
contained within the data payload of 1394 block write transactions [...]
"""
"""
Unicast IP datagrams that require quality of service other than
best-effort are beyond the scope of this standard.
"""
--
Stefan Richter
-=====-==-=- =-== =----
http://arcgraph.de/sr/