On Wed, Mar 29, 2006 at 10:35:10AM -0800, Mingming Cao wrote:
> On Wed, 2006-03-29 at 11:13 +0200, Laurent Vivier wrote:
> >
> > You're right, Mingming.
> >
> > But I think instead of thinking to change "long" by "long long" we
> > should think about changing "long" by "unsigned long" in the per-cpu
> > counter structure.
> >
> > Is there someone knowing why this counter is signed ?
>
> I am wondering the same thing asked by Laurent. Initially I thought the
> signed value is there to prevent overflow, or to maintain a "int" type
> counters. Are those the intentions, kiran?
I don't know if the local counter version values can be unsigned in this
case. Consider a case like this with the initial counter value to be 0,
and FBC_BATCH is 32 (8cpusx4)
cpu 1 cpu 2 cpu 3
-------- ------- --------
add(10)
//local = 10 fbc = 0.
sub(5)
//local = -5 fbc = 0
add(31)
//local = 31 fbc = 0
sub(30)
//local = 0 fbc = -35
--------------->(A)
Now if the local counters were unsigned, and the global counters unsigned
too, counter read at A would result in a large value, which would mislead
the app. Maybe it doesn't matter if we use percpu_counter_exceeds at
critical places, so these get caught, but that would mean going on all cpus
more often than before..and that would also mean weird values when we just
use percpu_counter_read to print these counters.
So maybe using long long is a simpler solution here? Andrew, thoughts?
>
> But it seems the per cpu counters used in ext2/3 are all number of free
> blocks/inodes/directories. So it should be always positive values. It
> seems fine to change the percpu counters to type "unsigned long" for
> ext2/3 itself. But I am not sure if this will cause issues for other
> users of percpu counters. Kiran, could you please confirm this?
I guess most of the uses for per-cpu counters will be up counters, we don't
need the signedness if it wasn't for the issues above. The nr_files,
memory_allocated counters are up counters too.
Thanks,
Kiran
On Wed, 2006-03-29 at 12:00 -0800, Ravikiran G Thirumalai wrote:
> On Wed, Mar 29, 2006 at 10:35:10AM -0800, Mingming Cao wrote:
> > On Wed, 2006-03-29 at 11:13 +0200, Laurent Vivier wrote:
> > >
> > > You're right, Mingming.
> > >
> > > But I think instead of thinking to change "long" by "long long" we
> > > should think about changing "long" by "unsigned long" in the per-cpu
> > > counter structure.
> > >
> > > Is there someone knowing why this counter is signed ?
> >
> > I am wondering the same thing asked by Laurent. Initially I thought the
> > signed value is there to prevent overflow, or to maintain a "int" type
> > counters. Are those the intentions, kiran?
>
> I don't know if the local counter version values can be unsigned in this
> case. Consider a case like this with the initial counter value to be 0,
> and FBC_BATCH is 32 (8cpusx4)
>
> cpu 1 cpu 2 cpu 3
> -------- ------- --------
> add(10)
> //local = 10 fbc = 0.
> sub(5)
> //local = -5 fbc = 0
> add(31)
> //local = 31 fbc = 0
>
> sub(30)
> //local = 0 fbc = -35
> --------------->(A)
>
> Now if the local counters were unsigned, and the global counters unsigned
> too, counter read at A would result in a large value, which would mislead
> the app.
I was thinking to change the global count to "unsigned long", but we
still need to use signed value (long) for the per cpu counters(local
counter), as they are relative values and could be negative.
Something like this:
struct percpu_counter {
spinlock_t lock;
- long count;
+ unsigned long count;
long *counters;
};
This works for ext2/3, as the global value always initialized to some
positive value (e.g. the # of free blocks when the filesystem is
created). But I am concerned the current other two users of percpu
counters(nr_files in VFS and memory_allocated in network code), where
the global value could be initilized to 0, and will have the issue that
you just described.
> Maybe it doesn't matter if we use percpu_counter_exceeds at
> critical places, so these get caught, but that would mean going on all cpus
> more often than before..and that would also mean weird values when we just
> use percpu_counter_read to print these counters.
>
Wild suggestion, how about we don't update the global counter is the
result is negative?
> So maybe using long long is a simpler solution here? Andrew, thoughts?
>
> >
> > But it seems the per cpu counters used in ext2/3 are all number of free
> > blocks/inodes/directories. So it should be always positive values. It
> > seems fine to change the percpu counters to type "unsigned long" for
> > ext2/3 itself. But I am not sure if this will cause issues for other
> > users of percpu counters. Kiran, could you please confirm this?
>
> I guess most of the uses for per-cpu counters will be up counters, we don't
> need the signedness if it wasn't for the issues above. The nr_files,
> memory_allocated counters are up counters too.
>
Okey, that's good to know.
> Thanks,
> Kiran
On Wed, Mar 29, 2006 at 12:38:37PM -0800, Mingming Cao wrote:
> On Wed, 2006-03-29 at 12:00 -0800, Ravikiran G Thirumalai wrote:
> > On Wed, Mar 29, 2006 at 10:35:10AM -0800, Mingming Cao wrote:
> >
> Wild suggestion, how about we don't update the global counter is the
> result is negative?
You mean just keep the local version even below -FBC_BATCH
and only empty it to the global unsigned counter if the result is going to
be +ve? That would work I think.
Thanks,
Kiran