Hey peeps - some of you may have already noticed that in my public git
tree, the "v5.12-rc1" tag has magically been renamed to
"v5.12-rc1-dontuse". It's still the same object, it still says
"v5.12-rc1" internally, and it is still signed by me, but the
user-visible name of the tag has changed.
The reason is fairly straightforward: this merge window, we had a very
innocuous code cleanup and simplification that raised no red flags at
all, but had a subtle and very nasty bug in it: swap files stopped
working right. And they stopped working in a particularly bad way:
the offset of the start of the swap file was lost.
Swapping still happened, but it happened to the wrong part of the
filesystem, with the obvious catastrophic end results.
Now, the good news is that even if you do use swap (and hey, that's nowhere
near as common as it used to be), most people don't use a swap *file*,
but a separate swap *partition*. And the bug in question really only
happens when you have a regular filesystem and put a file on it to use
as swap.
And, as far as I know, all the normal distributions set things up with
swap partitions, not files, because honestly, swapfiles tend to be
slower and have various other complexity issues.
The bad news is that the reason we support swapfiles in the first
place is that they do end up having some flexibility advantages, and
so some people do use them for that reason. If so, do not use rc1.
Thus the renaming of the tag.
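For anyone unsure which setup they have, here is a minimal sketch of a
check. It assumes the usual Linux /proc/swaps layout (one header line,
then one line per swap area with the type in the second column); the
function name list_swap_files is made up for illustration.

```shell
# Print the path of every active swap area that is a regular file
# rather than a partition. An optional argument points the function at
# a saved copy of /proc/swaps instead of the live one.
list_swap_files() {
    # Skip the header line, then keep entries whose Type column is "file".
    awk 'NR > 1 && $2 == "file" { print $1 }' "${1:-/proc/swaps}"
}
```

Run with no argument it reads the live /proc/swaps. Any path it prints
is a swap file; no output means only partitions (or no swap at all) are
active.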
Yes, this is very unfortunate, but it really wasn't a very obvious
bug, and it didn't even show up in normal testing, exactly because
swapfiles just aren't normal. So I'm not blaming the developers in
question, and it also wasn't due to the odd timing of the merge
window, it was just simply an unusually nasty bug that did get caught
and is fixed in the current tree.
But I want everybody to be aware of it because _if_ it bites you, it
bites you hard, and you can end up with a filesystem that is
essentially overwritten by random swap data. This is what we in the
industry call "double ungood".
Now, there's a couple of additional reasons for me writing this note
other than just "don't run 5.12-rc1 if you use a swapfile". Because
it's more than just "ok, we all know the merge window is when all the
new scary code gets merged, and rc1 can be a bit scary and not work
for everybody". Yes, rc1 tends to be buggier than later rc's, we are
all used to that, but honestly, most of the time the bugs are much
smaller annoyances than this time.
And in fact, most of our rc1 releases have been so solid over the
years that people may have forgotten that "yeah, this is all the new
code that can have nasty bugs in it".
One additional reason for this note is that I want to not just warn
people to not run this if you have a swapfile - even if you are
personally not impacted (like I am, and probably most people are -
swap partitions all around) - I want to make sure that nobody starts
new topic branches using that 5.12-rc1 tag. I know a few developers
tend to go "Ok, rc1 is out, I got all my development work into this
merge window, I will now fast-forward to rc1 and use that as a base
for the next release". Don't do it this time. It may work perfectly
well for you because you have the common partition setup, but it can
end up being a horrible base for anybody else that might end up
bisecting into that area.
And the *final* reason I want to just note this is a purely git
process one: if you already pulled my git tree, you will have that
"v5.12-rc1" tag, and the fact that it no longer exists in my public
tree under that name changes nothing at all for you. Git is
distributed, and me removing that tag and replacing it with another
name doesn't magically remove it from other copies unless you have
special mirroring code.
So if you have a kernel git tree (and I'm here assuming "origin"
points to my trees), and you do
    git fetch --tags origin
you _will_ now see the new "v5.12-rc1-dontuse" tag. But git won't
remove the old v5.12-rc1 tag, because while git will see that it is
not upstream, git will just assume that that simply means that it's
your own local tag. Tags, unlike branch names, are a global namespace
in git.
So you should additionally do a "git tag -d v5.12-rc1" to actually get
rid of the original tag name.
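Put together, the two steps might look like the sketch below. The
function name and its defaults are illustrative only, and the extra
"git ls-remote" check is an added safety net rather than something the
steps above require.

```shell
# Refresh tags from a remote, then delete a local tag that the remote
# no longer publishes. Arguments: remote name, tag name.
drop_tag_if_gone_upstream() {
    remote=${1:-origin}
    tag=${2:-v5.12-rc1}

    git fetch --tags "$remote" || return 1

    # With --exit-code, ls-remote exits 2 when no matching ref exists
    # on the remote; any other non-zero status is a real failure.
    git ls-remote --exit-code --tags "$remote" "refs/tags/$tag" >/dev/null
    case $? in
        0) echo "$tag still exists on $remote; keeping it"; return 0 ;;
        2) ;;
        *) return 1 ;;
    esac

    # Only delete the tag if it is actually present locally.
    if git rev-parse -q --verify "refs/tags/$tag" >/dev/null; then
        git tag -d "$tag"
    fi
}
```

Called as "drop_tag_if_gone_upstream origin v5.12-rc1" inside a kernel
tree, it leaves the new "v5.12-rc1-dontuse" tag in place and removes
only the stale name.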
Of course, having the old tag doesn't really do anything bad, so this
git process thing is entirely up to you. As long as you don't _use_
v5.12-rc1 for anything, having the tag around won't really matter, and
having both 'v5.12-rc1' _and_ 'v5.12-rc1-dontuse' doesn't hurt
anything either, and seeing both is hopefully already sufficient
warning of "let's not use that then".
Sorry for this mess,
Linus
Hi!
> Hey peeps - some of you may have already noticed that in my public git
> tree, the "v5.12-rc1" tag has magically been renamed to
> "v5.12-rc1-dontuse". It's still the same object, it still says
> "v5.12-rc1" internally, and it is still signed by me, but the
> user-visible name of the tag has changed.
>
> The reason is fairly straightforward: this merge window, we had a very
> innocuous code cleanup and simplification that raised no red flags at
> all, but had a subtle and very nasty bug in it: swap files stopped
> working right. And they stopped working in a particularly bad way:
> the offset of the start of the swap file was lost.
>
> Swapping still happened, but it happened to the wrong part of the
> filesystem, with the obvious catastrophic end results.
Fun :-(.
> One additional reason for this note is that I want to not just warn
> people to not run this if you have a swapfile - even if you are
> personally not impacted (like I am, and probably most people are -
> swap partitions all around) - I want to make sure that nobody starts
> new topic branches using that 5.12-rc1 tag. I know a few developers
> tend to go "Ok, rc1 is out, I got all my development work into this
> merge window, I will now fast-forward to rc1 and use that as a base
> for the next release". Don't do it this time. It may work perfectly
> well for you because you have the common partition setup, but it can
> end up being a horrible base for anybody else that might end up
> bisecting into that area.
Would it make sense to do a -rc2, now, so new topic branches can be
started on that one?
Best regards,
Pavel
--
http://www.livejournal.com/~pavelmachek
On 04/03/2021 13:43, Pavel Machek wrote:
>> One additional reason for this note is that I want to not just warn
>> people to not run this if you have a swapfile - even if you are
>> personally not impacted (like I am, and probably most people are -
>> swap partitions all around) - I want to make sure that nobody starts
>> new topic branches using that 5.12-rc1 tag. I know a few developers
>> tend to go "Ok, rc1 is out, I got all my development work into this
>> merge window, I will now fast-forward to rc1 and use that as a base
>> for the next release". Don't do it this time. It may work perfectly
>> well for you because you have the common partition setup, but it can
>> end up being a horrible base for anybody else that might end up
>> bisecting into that area.
>
> Would it make sense to do a -rc2, now, so new topic branches can be
> started on that one?
+1 to this idea. I already applied a few patches, well, on top of
v5.12-rc1, so it would be nice if I could stop this sooner rather than later.
Best regards,
Krzysztof
Hi Linus,
On Thu, Mar 4, 2021 at 1:59 PM Linus Torvalds wrote:
> And, as far as I know, all the normal distributions set things up with
> swap partitions, not files, because honestly, swapfiles tend to be
> slower and have various other complexity issues.
Looks like this has changed in at least Ubuntu: my desktop machine,
which got Ubuntu 18.04LTS during initial installation, is using a (small)
swapfile instead of a swap partition.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Hi David,
On Thu, Mar 4, 2021 at 5:56 PM David Laight wrote:
> > On Thu, Mar 4, 2021 at 1:59 PM Linus Torvalds wrote:
> > > And, as far as I know, all the normal distributions set things up with
> > > swap partitions, not files, because honestly, swapfiles tend to be
> > > slower and have various other complexity issues.
> >
> > Looks like this has changed in at least Ubuntu: my desktop machine,
> > which got Ubuntu 18.04LTS during initial installation, is using a (small)
> > swapfile instead of a swap partition.
>
> My older ubuntu (13.04) didn't have swap at all.
IIRC, the small swapfile was the default suggestion. I don't really
need swap (yummy, 53 GiB in buff/cache ;-)
> I had to add some when running multiple copies of the Altera
> fpga software started causing grief.
> That will be a file.
Or switch FPGA, and use yosys ;-)
> After all once you start swapping it is all horrid and slow.
> Swap to file may be slower, but apart from dumping out inactive
> pages you really don't want to be doing it - so it doesn't matter.
>
> Historically swap was a partition and larger than physical memory.
> This allows simple 'dump to swap' on panic (for many disk types).
> But I've not seen that support in linux.
I know. We started with lots of small partitions, but nowadays the
distros want to install everything in a single[*] partition, even swap.
[*] Ignoring /boot/efi, which didn't exist in the good ol' days.
Gr{oetje,eeting}s,
Geert
> On Thu, Mar 4, 2021 at 1:59 PM Linus Torvalds
> <[email protected]> wrote:
> > And, as far as I know, all the normal distributions set things up with
> > swap partitions, not files, because honestly, swapfiles tend to be
> > slower and have various other complexity issues.
>
> Looks like this has changed in at least Ubuntu: my desktop machine,
> which got Ubuntu 18.04LTS during initial installation, is using a (small)
> swapfile instead of a swap partition.
My older ubuntu (13.04) didn't have swap at all.
I had to add some when running multiple copies of the Altera
fpga software started causing grief.
That will be a file.
After all once you start swapping it is all horrid and slow.
Swap to file may be slower, but apart from dumping out inactive
pages you really don't want to be doing it - so it doesn't matter.
Historically swap was a partition and larger than physical memory.
This allows simple 'dump to swap' on panic (for many disk types).
But I've not seen that support in linux.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Thu, Mar 4, 2021 at 4:43 AM Pavel Machek wrote:
>
> Would it make sense to do a -rc2, now, so new topic branches can be
> started on that one?
I was planning on doing an rc2 earlier, just not "this minute" early.
I was expecting to see a few more of the normal fixes pulls come in,
and perhaps do it Friday instead of the usual Sunday.
Because regardless of an accelerated rc2, I thought it was much more
important to rename rc1 and let people know to avoid it.
And yes, obviously it was inevitably too late for some people, but
doing an rc2 wouldn't have helped those people either. So the most
important part was making rc1 itself less reachable by doing that
"dontuse" rename.
(And I should probably have done that rename even earlier, but I was
waiting to see if I could get more confirmation that it really was
fixed. And in hindsight that was entirely pointless and stupid of me -
we knew there was some serious rc1 problem, and the renaming had
nothing to do with whether it was fixed or not. Oh well. Water under
the bridge).
But I also can heartily just recommend that people who already _did_
start on rc1 rebase their current - hopefully not extensive - work.
I know I've ranted about rebasing for years, and it has huge
downsides, but the operation does exist because sometimes you just
need to fix serious errors. So _mindful_ rebasing, understanding why
it shouldn't be a normal thing, but doing it when something
exceptional happens - that's not wrong.
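For a small topic branch, such a one-off rebase could be sketched as
follows. The helper name and its three arguments are placeholders: the
old base would be the renamed rc1 tag, and the new base any later tag
or commit that contains the fix.

```shell
# Replay the commits of a topic branch that sit on top of old_base
# onto new_base instead, leaving old_base itself out of the history.
rebase_off_bad_base() {
    old_base=$1   # e.g. v5.12-rc1-dontuse
    new_base=$2   # a later tag or commit that contains the fix
    topic=$3      # the branch to move
    git rebase --onto "$new_base" "$old_base" "$topic"
}
```

Only the commits in old_base..topic are rewritten, so anything already
published from that branch gets new commit IDs; that is the usual
rebasing downside and the reason to keep this for exceptional cases.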
Linus
[CCing the git list]
On Wed, Mar 03, 2021 at 12:53:18PM -0800, Linus Torvalds wrote:
> Hey peeps - some of you may have already noticed that in my public git
> tree, the "v5.12-rc1" tag has magically been renamed to
> "v5.12-rc1-dontuse". It's still the same object, it still says
> "v5.12-rc1" internally, and it is still signed by me, but the
> user-visible name of the tag has changed.
>
> The reason is fairly straightforward: this merge window, we had a very
> innocuous code cleanup and simplification that raised no red flags at
> all, but had a subtle and very nasty bug in it: swap files stopped
> working right. And they stopped working in a particularly bad way:
> the offset of the start of the swap file was lost.
>
> Swapping still happened, but it happened to the wrong part of the
> filesystem, with the obvious catastrophic end results.
[...]
> One additional reason for this note is that I want to not just warn
> people to not run this if you have a swapfile - even if you are
> personally not impacted (like I am, and probably most people are -
> swap partitions all around) - I want to make sure that nobody starts
> new topic branches using that 5.12-rc1 tag. I know a few developers
> tend to go "Ok, rc1 is out, I got all my development work into this
> merge window, I will now fast-forward to rc1 and use that as a base
> for the next release". Don't do it this time. It may work perfectly
> well for you because you have the common partition setup, but it can
> end up being a horrible base for anybody else that might end up
> bisecting into that area.
Even if people avoid basing their topic branches on 5.12-rc1, it's still
possible for a future bisect to end up wandering to one of the existing
dangerous commits, if someone's trying to find a historical bug and git
happens to choose that as a halfway point. And if they happen to be
using a swap file, they could end up with serious data loss, years from
now when "5.12-rc1 is broken" isn't on the top of their mind or even
something they heard about originally.
Would it make sense to add a feature to git that allows defining a
"dangerous" region for bisect? Rough sketch:
- Add a `/.git-bisect-dangerous` file to the repository, containing a
list of commit range expressions (contains commit X, doesn't
contain commit Y) and associated messages ("Do not use these kernels
if you have a swap file; if you need to bisect into here, disable swap
files first").
- git-bisect, as it navigates commits, always checks that file for any
commit it processes, and adds any new entries it sees into
`.git/bisect-dangerous`; it never removes entries from there.
- git-bisect avoids choosing bisection points anywhere in that range
until it absolutely has to (because it's narrowed an issue to that
range). This can use something similar to the existing `git bisect
skip` machinery. Manual bisections print the message at that point.
Automated bisections (`git bisect run`) stop and print the range
without narrowing further, unless the user passes something like
`--dangerous-ok=commit-range`.
(git notes would be nice for this, but they're hard to share reliably;
the above mechanism to accumulate entries from a file in the repo seems
simpler. I can imagine other possibilities.)
Does something like this seem potentially reasonable, and worth doing to
help people avoid future catastrophic data loss?
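Until something like that exists, part of the effect can be
approximated by hand with the existing skip machinery, which already
accepts commit ranges. The sketch below assumes a hypothetical text
file with one "<range> <message>" entry per line; neither the file
format nor the helper name is an existing git feature, and it has to be
run by the user after `git bisect start`.

```shell
# Read "<range> <message...>" lines from a file (hypothetical format)
# and mark every listed range as skipped in the running bisection.
skip_dangerous_ranges() {
    file=$1
    while read -r range message; do
        # Ignore blank lines and "#" comments.
        case $range in ''|'#'*) continue ;; esac
        echo "skipping $range: $message" >&2
        # A range A..B skips the commits after A, up to and including B.
        git bisect skip "$range" </dev/null || return 1
    done < "$file"
}
```

If the culprit turns out to be inside a skipped range, `git bisect`
stops and lists the remaining skipped candidates instead of checking
them out, which is close to the "stop and print the range" behaviour
sketched above.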
- Josh Triplett
On Fri, Mar 5, 2021 at 1:58 AM Josh Triplett wrote:
> On Wed, Mar 03, 2021 at 12:53:18PM -0800, Linus Torvalds wrote:
> > One additional reason for this note is that I want to not just warn
> > people to not run this if you have a swapfile - even if you are
> > personally not impacted (like I am, and probably most people are -
> > swap partitions all around) - I want to make sure that nobody starts
> > new topic branches using that 5.12-rc1 tag. I know a few developers
> > tend to go "Ok, rc1 is out, I got all my development work into this
> > merge window, I will now fast-forward to rc1 and use that as a base
> > for the next release". Don't do it this time. It may work perfectly
> > well for you because you have the common partition setup, but it can
> > end up being a horrible base for anybody else that might end up
> > bisecting into that area.
>
> Even if people avoid basing their topic branches on 5.12-rc1, it's still
> possible for a future bisect to end up wandering to one of the existing
> dangerous commits, if someone's trying to find a historical bug and git
> happens to choose that as a halfway point. And if they happen to be
> using a swap file, they could end up with serious data loss, years from
> now when "5.12-rc1 is broken" isn't on the top of their mind or even
> something they heard about originally.
>
> Would it make sense to add a feature to git that allows defining a
> "dangerous" region for bisect? Rough sketch:
> - Add a `/.git-bisect-dangerous` file to the repository, containing a
> list of of commit range expressions (contains commit X, doesn't
> contain commit Y) and associated messages ("Do not use these kernels
> if you have a swap file; if you need to bisect into here, disable swap
> files first").
> - git-bisect, as it navigates commits, always checks that file for any
> commit it processes, and adds any new entries it sees into
> `.git/bisect-dangerous`; it never removes entries from there.
The `git bisect skip` machinery uses `refs/bisect/skip-<commit>` refs
instead of such a file, so I wonder if such a file is needed. It could
be used to store a map between skipped commits and the associated
messages though. Or git notes could be used for that purpose.
By the way I wonder what should happen if a commit is associated with
a message by a `/.git-bisect-dangerous` file, but in another branch
such file associates it with a different message. I guess all the
different messages should be stored, and then displayed.
> - git-bisect avoids choosing bisection points anywhere in that range
> until it absolutely has to (because it's narrowed an issue to that
> range). This can use something similar to the existing `git bisect
> skip` machinery. Manual bisections print the message at that point.
> Automated bisections (`git bisect run`) stop and print the range
> without narrowing further, unless the user passes something like
> `--dangerous-ok=commit-range`.
Yeah, using the `git bisect skip` machinery looks like a good idea.
Instead of `/.git-bisect-dangerous`, the file could actually be called
`/.git-bisect-skip` and could also store ranges where the code doesn't
compile, or completely misbehaves, without necessarily being dangerous.
The dangerous status would only be conveyed by the associated messages
then.
Another way could be to directly share some special refs similar to
the existing `refs/bisect/skip-<commit>` refs, instead of a
`/.git-bisect-dangerous` file. This would likely raise some issues
about how to create and share these refs and the associated messages
though.
> (git notes would be nice for this, but they're hard to share reliably;
> the above mechanism to accumulate entries from a file in the repo seems
> simpler. I can imagine other possibilities.)
If the notes are created automatically from the `/.git-bisect-skip`
files and stored in `refs/notes/skip`, then they might not need to be
shared. If people already share notes, they would just need to stop
sharing those in `refs/notes/skip`.
> Does something like this seem potentially reasonable, and worth doing to
> help people avoid future catastrophic data loss?
It seems reasonable as part of the skip mechanism.
...
> > Historically swap was a partition and larger than physical memory.
> > This allows simple 'dump to swap' on panic (for many disk types).
> > But I've not seen that support in linux.
>
> I know. We started with lots of small partitions, but nowadays the
> distros want to install everything in a single[*] partition, even swap.
Having multiple partitions is partially due to the size of disks.
But I prefer to install the 'system' in one partition and put all
my own files in a different one.
Then I can install a different version of the OS into a 3rd
partition and be able to boot different versions.
Many years ago I updated NetBSD's x86 mbr code to read the
extended partition table so you could choose to read the
next level boot from sector zero of any partition.
Squeezing that into the ~400 bytes available is a masterpiece.
[*] The 'single partition' is a simplicity cop-out.
David
Christian Couder writes:
>> (git notes would be nice for this, but they're hard to share reliably;
>> the above mechanism to accumulate entries from a file in the repo seems
>> simpler. I can imagine other possibilities.)
>
> If the notes are created automatically from the `/.git-bisect-skip`
> files and stored in `refs/notes/skip`, then they might not need to be
> shared. If people already share notes, they would just need to stop
> sharing those in `refs/notes/skip`.
Ehh, doesn't Josh _want_ to share them, though? I do not know if a
single "refs/notes/bisect-skip" notes tree would do, or you'd need one
notes tree per kind of bisection (iow, people may be chasing
different sets of bugs, and the commits that need to be skipped while
chasing one bug may be OK to test while chasing another one), but I
would imagine that for this particular use case of marking "these
commits are dangerous to check out and build on", it does not depend
on what you are bisecting to find at all, so sharing would be a
sensible thing to do.
It is trivial for you to fetch the refs/notes/do-not--checkout notes
tree from me and merge it into your refs/notes/do-not--checkout
notes tree, I would think; "git notes merge" may have room for
improvement, but essentially it would just want a union of two
sets, no?
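A sketch of that fetch-and-merge, assuming the notes live under the
ref name used above and that remote copies are kept under
refs/notes/remotes/<remote>/ (a local convention chosen here, not a
git default). The helper name is likewise only illustrative.

```shell
# Fetch one notes ref from a remote into a tracking location, then
# union-merge it into the local notes ref of the same name.
sync_notes() {
    remote=${1:-origin}
    name=${2:-do-not--checkout}
    tracking="refs/notes/remotes/$remote/$name"

    git fetch "$remote" "+refs/notes/$name:$tracking" || return 1
    # The "union" strategy concatenates both sides when the same commit
    # carries a note locally and remotely, so nothing is dropped.
    git notes --ref="$name" merge -s union "$tracking"
}
```

The merged result can then be read back per commit with
"git notes --ref=do-not--checkout show <commit>".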
On Fri, Mar 05, 2021 at 10:10:05AM -0800, Junio C Hamano wrote:
> Christian Couder writes:
>
> >> (git notes would be nice for this, but they're hard to share reliably;
> >> the above mechanism to accumulate entries from a file in the repo seems
> >> simpler. I can imagine other possibilities.)
> >
> > If the notes are created automatically from the `/.git-bisect-skip`
> > files and stored in `refs/notes/skip`, then they might not need to be
> > shared. If people already share notes, they would just need to stop
> > sharing those in `refs/notes/skip`.
>
> Ehh, doesn't Josh _want_ to share them, though? I do not know if a
> single "refs/notes/bisect-skip" notes would do, or you'd need one
> notes tree per the kind of bisection (iow, people may be chasing
> different set of bugs, and the commits that need to be skipped while
> chasing one bug may be OK to test while chasing another one), but I
> would imagine that for this particular use case of marking "these
> commits are dangerous to check out and build on", it does not depend
> on what you are bisecting to find at all, so sharing would be a
> sensible thing to do.
>
> It is trivial for you to fetch the refs/notes/do-not--checkout notes
> tree from me and merge it into your refs/notes/do-not--checkout
> notes tree, I would think; "git notes merge" may have room for
> improvement, but essentially it would just want a union of two
> sets, no?
My primary concern about notes is that they require manual
action/configuration in order to share. I was looking for a solution
that would automatically protect anyone who pulled from linux.git
(directly or indirectly), without them having to specifically take a
separate step to sync this information.
Josh Triplett writes:
>> It is trivial for you to fetch the refs/notes/do-not--checkout notes
>> tree from me and merge it into your refs/notes/do-not--checkout
>> notes tree, I would think; "git notes merge" may have room for
>> improvement, but essentially it would just want a union of two
>> sets, no?
>
> My primary concern about notes is that they require manual
> action/configuration in order to share. I was looking for a solution
> that would automatically protect anyone who pulled from linux.git
> (directly or indirectly), without them having to specifically take a
> separate step to sync this information.
If "without any configuration" is a hard requirement, then by
definition you'd need to live with what you get from "git clone" and
"git pull" alone. Be it notes or any other mechanism, there
currently is nothing that lets you do the "skip this part while
bisecting".
* Linus Torvalds wrote:
> But I also can heartily just recommend that people who already _did_
> start on rc1 rebase their current - hopefully not extensive - work.
Thanks for the heads-up - we just rebased about 50 commits in -tip to
avoid this bug: our normal workflow is to jump on -rc1 once it's
released and integrate pending development work that we normally don't
apply during the merge window. So our special pattern of pent-up
merging did bite us a little bit - but nothing particularly serious.
> I know I've ranted about rebasing for years, and it has huge
> downsides, but the operation does exist because sometimes you just
> need to fix serious errors. So _mindful_ rebasing, understanding why
> it shouldn't be a normal thing, but doing it when something
> exceptional happens - that's not wrong.
Yeah, and in this case not sending scarce-resource testers & bisecters
into the middle of a file corruption bug is definitely the right thing
to do.
Maybe -next could double check that none of the maintainer trees have
an -rc1 base? Your note here was kind of low-key. :-)
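Such a check could be as small as the sketch below, which treats a
branch as "based on" a tag when its merge base with mainline is exactly
the tagged commit. That is a heuristic (a branch that later merged
newer mainline would not be flagged), and the helper name and arguments
are invented for illustration.

```shell
# Exit 0 when the given branch forks from mainline exactly at the
# tagged commit, non-zero otherwise.
forks_from() {
    branch=$1
    mainline=$2
    tag=$3
    base=$(git merge-base "$branch" "$mainline") || return 2
    # Peel the tag to a commit so annotated tags compare correctly.
    [ "$base" = "$(git rev-parse -q --verify "$tag^{commit}")" ]
}
```

A loop over the maintainer branches calling
"forks_from <branch> <mainline> v5.12-rc1-dontuse" would then list the
trees that still need attention.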
And maybe there's some bisection helper annotation or hook-script that
can be embedded in the kernel Git tree to avoid or at least warn about
particularly nasty bugs?
Thanks,
Ingo