2008-08-22 21:39:51

by Steven Rostedt

Subject: 2.6.26.3-rt3

We are pleased to announce the 2.6.26.3-rt3 tree, which can be
downloaded from the location:

http://rt.et.redhat.com/download/

Information on the RT patch can be found at:

http://rt.wiki.kernel.org/index.php/Main_Page

Changes since 2.6.26-rt2

- patch merge fix (Steven Rostedt)

- fix net core sock locking (Chirag Jog)

- namespace lock fixes (Chirag Jog)

- hrtimers stuck in waitqueue fix (Thomas Gleixner)


To build a 2.6.26.3-rt3 tree, the following patches should be applied:

http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.26.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/patch-2.6.26.3.bz2
http://rt.et.redhat.com/download/patch-2.6.26.3-rt3.bz2



And as always, my RT version of Matt Mackall's ketchup will get this
for you nicely:

http://people.redhat.com/srostedt/rt/tools/ketchup-0.9.8-rt3


The broken out patches are also available.


-- Steve


2008-08-22 23:40:07

by John Kacur

Subject: Re: 2.6.26.3-rt3

On Fri, Aug 22, 2008 at 11:39 PM, Steven Rostedt <[email protected]> wrote:
> We are pleased to announce the 2.6.26.3-rt3 tree, which can be
> downloaded from the location:
>
> http://rt.et.redhat.com/download/
>
> Information on the RT patch can be found at:
>
> http://rt.wiki.kernel.org/index.php/Main_Page
>
> Changes since 2.6.26-rt2
>
> - patch merge fix (Steven Rostedt)
>
> - fix net core sock locking (Chirag Jog)

Actually Peter Zijlstra. (Chirag was just first in the email thread)

>
> - namespace lock fixes (Chirag Jog)
>
> - hrtimers stuck in waitqueue fix (Thomas Gleixner)
>
>
> to build a 2.6.26.3-rt3 tree, the following patches should be applied:
>
> http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.26.tar.bz2
> http://kernel.org/pub/linux/kernel/v2.6/patch-2.6.26.3.bz2
> http://rt.et.redhat.com/download/patch-2.6.26.3-rt3.bz2
>
>
>
> And like always, my RT version of Matt Mackall's ketchup will get this
> for you nicely:
>
> http://people.redhat.com/srostedt/rt/tools/ketchup-0.9.8-rt3
>
>
> The broken out patches are also available.
>

One more patch that was missed - it was discussed here
http://marc.info/?l=linux-rt-users&m=121846031913931&w=2

I am resending it, please consider for -rt4.
Without it I continue to get the following type of message.

BUG: using smp_processor_id() in preemptible [00000000] code: firefox-bin/3912
caller is __qdisc_run+0x160/0x1e9
Pid: 3912, comm: firefox-bin Tainted: G W 2.6.26.3-rt2 #6

Call Trace:
[<ffffffff8033cc96>] debug_smp_processor_id+0xde/0xec
[<ffffffff803f9a87>] __qdisc_run+0x160/0x1e9
[<ffffffff803e8777>] dev_queue_xmit+0x1b3/0x2ee
[<ffffffff8040df1e>] ip_finish_output+0x29b/0x2e4
[<ffffffff8040e04a>] ip_output+0xe3/0xec
[<ffffffff8040cf9c>] ip_local_out+0x25/0x29
[<ffffffff8040d80e>] ip_queue_xmit+0x2ce/0x35e
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff8027f844>] ? trace_preempt_on+0x1f/0xf9
[<ffffffff8041e9de>] ? tcp_transmit_skb+0x72a/0x78f
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff8041ea04>] tcp_transmit_skb+0x750/0x78f
[<ffffffff802ac050>] ? kmem_cache_alloc_node+0x11e/0x145
[<ffffffff804215a6>] __tcp_push_pending_frames+0x74a/0x860
[<ffffffff803e2c55>] ? __alloc_skb+0x70/0x136
[<ffffffff804158f1>] tcp_sendmsg+0x941/0xa5f
[<ffffffff802c01f3>] ? __pollwait+0x0/0xe5
[<ffffffff803dbd3e>] sock_sendmsg+0x102/0x125
[<ffffffff8024c3b3>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff80281a97>] ? tracing_hist_preempt_stop+0x2cb/0x2f5
[<ffffffff80272624>] ? __rcu_read_unlock+0x93/0xa7
[<ffffffff802b250e>] ? fget_light+0x97/0xad
[<ffffffff803dc8ee>] sys_sendto+0xe4/0x10c
[<ffffffff8024c3b3>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff8021251c>] ? native_sched_clock+0x2a/0x72
[<ffffffff803dc92a>] sys_send+0x14/0x16
[<ffffffff803f7dbb>] compat_sys_socketcall+0xd2/0x16c
[<ffffffff80224a17>] sysenter_do_call+0x8c/0x149
[<ffffffff8045f4ec>] ? trace_hardirqs_on_thunk+0x3a/0x3c

---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff8033cc43>] .... debug_smp_processor_id+0x8b/0xec
.....[<ffffffff803f9a87>] .. ( <= __qdisc_run+0x160/0x1e9)

BUG: firefox-bin:3912 task might have lost a preemption check!
Pid: 3912, comm: firefox-bin Tainted: G W 2.6.26.3-rt2 #6

Call Trace:
[<ffffffff80462e79>] ? sub_preempt_count+0xd1/0xe6
[<ffffffff80233bb1>] preempt_enable_no_resched+0x5c/0x5e
[<ffffffff8033cc9b>] debug_smp_processor_id+0xe3/0xec
[<ffffffff803f9a87>] __qdisc_run+0x160/0x1e9
[<ffffffff803e8777>] dev_queue_xmit+0x1b3/0x2ee
[<ffffffff8040df1e>] ip_finish_output+0x29b/0x2e4
[<ffffffff8040e04a>] ip_output+0xe3/0xec
[<ffffffff8040cf9c>] ip_local_out+0x25/0x29
[<ffffffff8040d80e>] ip_queue_xmit+0x2ce/0x35e
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff8027f844>] ? trace_preempt_on+0x1f/0xf9
[<ffffffff8041e9de>] ? tcp_transmit_skb+0x72a/0x78f
[<ffffffff804215a6>] ? __tcp_push_pending_frames+0x74a/0x860
[<ffffffff8041ea04>] tcp_transmit_skb+0x750/0x78f
[<ffffffff802ac050>] ? kmem_cache_alloc_node+0x11e/0x145
[<ffffffff804215a6>] __tcp_push_pending_frames+0x74a/0x860
[<ffffffff803e2c55>] ? __alloc_skb+0x70/0x136
[<ffffffff804158f1>] tcp_sendmsg+0x941/0xa5f
[<ffffffff802c01f3>] ? __pollwait+0x0/0xe5
[<ffffffff803dbd3e>] sock_sendmsg+0x102/0x125
[<ffffffff8024c3b3>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff80281a97>] ? tracing_hist_preempt_stop+0x2cb/0x2f5
[<ffffffff80272624>] ? __rcu_read_unlock+0x93/0xa7
[<ffffffff802b250e>] ? fget_light+0x97/0xad
[<ffffffff803dc8ee>] sys_sendto+0xe4/0x10c
[<ffffffff8024c3b3>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff8021251c>] ? native_sched_clock+0x2a/0x72
[<ffffffff803dc92a>] sys_send+0x14/0x16
[<ffffffff803f7dbb>] compat_sys_socketcall+0xd2/0x16c
[<ffffffff80224a17>] sysenter_do_call+0x8c/0x149
[<ffffffff8045f4ec>] ? trace_hardirqs_on_thunk+0x3a/0x3c

---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------


Thank You.


Attachments:
qdisc_run.patch (1.04 kB)

2008-08-23 01:27:57

by Steven Rostedt

Subject: Re: 2.6.26.3-rt3



On Sat, 23 Aug 2008, John Kacur wrote:
>
> One more patch that was missed - it was discussed here
> http://marc.info/?l=linux-rt-users&m=121846031913931&w=2
>
> I am resending it, please consider for -rt4.
> Without it I continue to get the following type of message.
>

Actually this was left out intentionally. The quick fix would be the one
that Gregory Haskins suggested about not reporting when the task is bound
to one CPU. This bothers me a little, but is probably OK for now. A task
can still be migrated from one CPU to another by the user, and this can
cause havoc if smp_processor_id() is used.

I needed to examine this a bit closer before coming up with a proper fix.

-- Steve

2008-08-23 01:36:33

by Steven Rostedt

Subject: Re: 2.6.26.3-rt3


On Fri, 22 Aug 2008, Steven Rostedt wrote:

>
>
> On Sat, 23 Aug 2008, John Kacur wrote:
> >
> > One more patch that was missed - it was discussed here
> > http://marc.info/?l=linux-rt-users&m=121846031913931&w=2
> >
> > I am resending it, please consider for -rt4.
> > Without it I continue to get the following type of message.
> >
>
> Actually this was left out intentionally. The quick fix would be the one
> that Gregory Haskins suggested about not reporting when the task is bound
> to one CPU. This bothers me a little, but is probably OK for now. A task
> can still be migrated from one CPU to another by the user, and this can
> cause havoc if smp_processor_id() is used.
>
> I needed to examine this a bit closer before coming up with a proper fix.


The patch below should be sufficient. I just changed your local_irq_save()
to preempt_disable(). I have this queued for -rt4, but that is where I also
plan on adding the latest ftrace updates, so it may take a bit to get it
out.

-- Steve


=======
From: Steven Rostedt <[email protected]>
Subject: suppress warning of smp_processor_id use.

John Kacur pointed out that the get_cpu_var used in net/sched/sch_generic.c
would trigger warnings. This was happening on a statistics variable and
by a softirq which is bound to a single thread.

John sent a patch that used local_irq_save, which is a little bit of
overkill. This version uses preempt_disable, but we still need to create
a preempt_disable_rt API that is only activated when PREEMPT_RT is configured.

Signed-off-by: Steven Rostedt <[email protected]>
---
net/sched/sch_generic.c | 2 ++
1 file changed, 2 insertions(+)

Index: linux-2.6.26.3-rt3/net/sched/sch_generic.c
===================================================================
--- linux-2.6.26.3-rt3.orig/net/sched/sch_generic.c 2008-08-22 21:28:50.000000000 -0400
+++ linux-2.6.26.3-rt3/net/sched/sch_generic.c 2008-08-22 21:29:38.000000000 -0400
@@ -112,7 +112,9 @@ static inline int handle_dev_cpu_collisi
 		 * Another cpu is holding lock, requeue & delay xmits for
 		 * some time.
 		 */
+		preempt_disable(); /* FIXME: we need an _rt version of this */
 		__get_cpu_var(netdev_rx_stat).cpu_collision++;
+		preempt_enable();
 		ret = dev_requeue_skb(skb, dev, q);
 	}
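
The effect of the two added lines can be illustrated with a small userspace
model in C. This is only a sketch under assumptions, not kernel code: every
`*_model` name and `NR_CPUS_MODEL` is hypothetical, migration is simulated by
an explicit function call, and a plain counter stands in for the kernel's
preempt count. It shows that the per-CPU statistic is updated on a single,
stable CPU because the task cannot be moved while the count is non-zero.

```c
#include <assert.h>

#define NR_CPUS_MODEL 4 /* hypothetical CPU count for the model */

static unsigned int preempt_count_model;  /* > 0: task may not be migrated */
static unsigned int current_cpu_model;    /* CPU the modelled task runs on */
static unsigned long cpu_collision_model[NR_CPUS_MODEL]; /* per-CPU stat */

void preempt_disable_model(void)
{
	preempt_count_model++;
}

void preempt_enable_model(void)
{
	assert(preempt_count_model > 0); /* catch unbalanced enable calls */
	preempt_count_model--;
}

/* Try to move the task to another CPU; returns 1 on success, 0 if refused. */
int try_migrate_model(unsigned int cpu)
{
	assert(cpu < NR_CPUS_MODEL);
	if (preempt_count_model > 0)
		return 0; /* preemption disabled: the task stays where it is */
	current_cpu_model = cpu;
	return 1;
}

/* Mirrors the shape of the patched hunk: disable, touch per-CPU data, enable. */
void count_collision_model(void)
{
	preempt_disable_model();
	cpu_collision_model[current_cpu_model]++;
	preempt_enable_model();
}
```

In this model a call to try_migrate_model() made between
preempt_disable_model() and preempt_enable_model() is refused, which is the
guarantee the per-CPU increment relies on.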

2008-08-23 11:33:28

by John Kacur

Subject: Re: 2.6.26.3-rt3

On Sat, Aug 23, 2008 at 3:36 AM, Steven Rostedt <[email protected]> wrote:
>
> On Fri, 22 Aug 2008, Steven Rostedt wrote:
>
>>
>>
>> On Sat, 23 Aug 2008, John Kacur wrote:
>> >
>> > One more patch that was missed - it was discussed here
>> > http://marc.info/?l=linux-rt-users&m=121846031913931&w=2
>> >
>> > I am resending it, please consider for -rt4.
>> > Without it I continue to get the following type of message.
>> >
>>
>> Actually this was left out intentionally. The quick fix would be the one
>> that Gregory Haskins suggested about not reporting when the task is bound
>> to one CPU. This bothers me a little, but is probably OK for now. A task
>> can still be migrated from one CPU to another by the user, and this can
>> cause havoc if smp_processor_id() is used.

Hmm, the e-mail thread died out at that point and moved to IRC after
we realized that debug_smp_processor_id() ALREADY had a check to see
if the kernel thread was bound to a single CPU (see below).

	/*
	 * Kernel threads bound to a single CPU can safely use
	 * smp_processor_id():
	 */
	this_mask = cpumask_of_cpu(this_cpu);

	if (cpus_equal(current->cpus_allowed, this_mask))
		goto out;

I'm inclined to believe that the test is correct and we really were
calling smp_processor_id() in preemptible code, which is why I offered my
patch up again.
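
To make the decision concrete, here is a small userspace sketch in C of when
the debug check would complain. It is a model under stated assumptions, not
the kernel source: `struct task_model`, `would_warn()` and the 64-bit mask are
hypothetical stand-ins for preempt_count(), irqs_disabled() and
current->cpus_allowed, and any further exemptions in the real function (for
example during early boot) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-task state the check inspects. */
struct task_model {
	unsigned int preempt_count; /* > 0: preemption is disabled */
	bool irqs_disabled;         /* local interrupts are off */
	uint64_t cpus_allowed;      /* bit n set: task may run on CPU n */
};

/* Returns true when the modelled check would print the BUG message. */
bool would_warn(const struct task_model *t, unsigned int this_cpu)
{
	assert(this_cpu < 64); /* the model's mask only covers 64 CPUs */

	if (t->preempt_count > 0)
		return false; /* cannot be preempted, so cannot migrate */
	if (t->irqs_disabled)
		return false; /* no interrupt can trigger a reschedule */
	if (t->cpus_allowed == (UINT64_C(1) << this_cpu))
		return false; /* bound to exactly this CPU */
	return true;
}
```

In this model, a task that is allowed on several CPUs and has a preempt count
of zero is reported, which is consistent with the `[00000000]` preempt count
shown in the BUG report earlier in this thread.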

>>
>> I needed to examine this a bit closer before coming up with a proper fix.
>
>
> The patch below should be sufficient. I just changed your local_irq_save()
> to preempt_disable(). I have this queued for -rt4, but that is where I also

Ok, good, I'm running with your lighter weight patch to make sure that
it is sufficient. Thanks.

> plan on adding the latest ftrace updates so it may take a bit to get it
> out.
>
> -- Steve
>
>
> =======
> From: Steven Rostedt <[email protected]>
> Subject: suppress warning of smp_processor_id use.
>
> John Kacur pointed out that the get_cpu_var used in net/sched/sch_generic.c
> would trigger warnings. This was happening on a statistics variable and
> by a softirq which is bound to a single thread.
>
> John sent a patch that used local_irq_save, which is a little bit of
> overkill. This version uses preempt_disable, but we still need to create
> a preempt_disable_rt API that is only activated when PREEMPT_RT is configured.
>

Could you perhaps leave in my original Signed-off-by: or at least convert it to
Debugged-by: John Kacur <jkacur at gmail dot com>

> Signed-off-by: Steven Rostedt <[email protected]>
> ---
> net/sched/sch_generic.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> Index: linux-2.6.26.3-rt3/net/sched/sch_generic.c
> ===================================================================
> --- linux-2.6.26.3-rt3.orig/net/sched/sch_generic.c 2008-08-22 21:28:50.000000000 -0400
> +++ linux-2.6.26.3-rt3/net/sched/sch_generic.c 2008-08-22 21:29:38.000000000 -0400
> @@ -112,7 +112,9 @@ static inline int handle_dev_cpu_collisi
> * Another cpu is holding lock, requeue & delay xmits for
> * some time.
> */
> + preempt_disable(); /* FIXME: we need an _rt version of this */

I think you should drop the FIXME comment. The discussion was about
the debug version of smp_processor_id() that is called via
__get_cpu_var() below, and it appears that it doesn't need any
fix-ups.
(Of course there is a chance that I misunderstood and you really do
mean some changes to preempt_disable(); please let me know.)

> __get_cpu_var(netdev_rx_stat).cpu_collision++;
> + preempt_enable();
> ret = dev_requeue_skb(skb, dev, q);
> }
>
>