bin/48843: sh(1): break/continue/return broken inside dot commands

Post by Richard Hansen
I will bring this up during the next Austin Group teleconference. We
should be able to get some improved wording in before POSIX Issue 7 TC2
is published (even if that wording is simply "unspecified" or
"implementation defined"). Any input from the NetBSD community would be
appreciated.
The intended behavior of break/continue outside of a loop is also
unclear. I'll bring that up as well.

That would be great, these things really should be clarified. This
probably belongs to "shell execution environment"? That is, whether
function and loop nesting levels are part of it or not. How about local
variables, or were they just a common extension that is not specified
in POSIX?

While we're on the subject of clarity, I'm not sure if the value $0
inside a dot command has been specified. It could be implied by
"current execution environment" but perhaps it wouldn't hurt to be
explicit.

Because I read the standard to mean that break and continue should have
an effect outside the sourced file, that's how I implemented it. For what
it's worth, this also seems to be what bash does.

The behavior of existing implementations will strongly influence the
direction the Austin Group takes when revising the text. With that
said, what behavior would you like POSIX to specify?

I have no opinion one way or another. I just took the easiest way to fix
existing breakage. I quickly tested with bash, ksh and zsh with a trivial
case: dot command in a while loop contains a break. bash does as my change
does: the loop is broken. ksh complains and seems to ignore the break.
zsh's behaviour is probably best described as "undefined", because it
complains, stops sourcing the file and yet does not break the loop.
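
The trivial case described above can be reproduced with a sketch like the
following. The temporary directory and the file name brk.sh are assumptions
made for illustration, and mktemp -d is a widely available utility rather
than something POSIX Issue 7 specifies. The count printed at the end depends
on the shell, so no expected output is given.

```shell
# Sketch only: brk.sh is a hypothetical file name, created in a temp dir.
tmpdir=$(mktemp -d) || exit 1
printf 'break\n' > "$tmpdir/brk.sh"

i=0
while [ "$i" -lt 3 ]; do
    i=$((i + 1))
    # If the shell lets break cross the dot boundary, the loop ends here.
    . "$tmpdir/brk.sh"
done

# 1 means the sourced break ended the loop; 3 means it did not.
echo "loop body ran $i time(s)"
rm -rf "$tmpdir"
```

Running the same file under several shells is one way to compare their
behavior.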

Post by Richard Hansen
Sourcing a file with the dot command and running a function are very
similar; how do your changes affect how the shell behaves when calling a
function that has break/continue in the body?

My changes retain the previous behavior, which was to allow a break or
a continue in a function to affect a loop that contains the function call.
Quick check for other shells: bash and zsh work the same, ksh complains and
does not break the loop nor return from the function.
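
A minimal sketch of the function case, assuming a hypothetical function name
leave_loop whose body contains a break but no loop of its own. Whether the
caller's loop is broken depends on the shell and its version, so the comment
lists both outcomes instead of a single expected result.

```shell
# leave_loop is a hypothetical name; its body has no loop of its own.
leave_loop() {
    break
}

count=0
for item in a b c; do
    count=$((count + 1))
    leave_loop
done

# 1 means break reached the caller's loop; 3 means it was ignored or rejected.
echo "loop body ran $count time(s)"
```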

How the function call worked was one more reason I chose to implement
the dot command fix as I did: one could think of this implementation of
a dot command as a function call that has no parameters (except "set --")
and that the function's body is stored in a file.

+A non-obvious consequence of the file executing in the current environment
+is that loop control keywords (continue and break) can be used in the file
+to control loops surrounding the dot command.

I agree, that would be better. Perhaps:

The POSIX standard is unclear on how loop control keywords (break
and continue) behave across a dot command boundary. This
implementation (currently) allows them to control loops surrounding
the dot command.

+The effects of using a return command outside a function or a dot command
+are not standardized.

One can argue that the effects are standardized as "unspecified". :)
How about: "The POSIX standard says that the results of running 'return'
outside a function or dot script are unspecified. This implementation..."

Yes, of course, my bad. Your formulation would be better.

christos, you made the previous commits, could you do those language
tweaks? Or should I prepare another patch?

--
Jarmo Jaakkola

David Holland

2014-05-31 23:55:48 UTC

Post by Jarmo Jaakkola
How the function call worked was one more reason I chose to implement
the dot command fix as I did: one could think of this implementation of
a dot command as a function call that has no parameters (except "set --")
and that the function's body is stored in a file.

Well... implementations other than netbsd's sh seem to allow passing
parameters to a dot command:

valkyrie% cat y2
. ./y a b c d e
valkyrie% cat y
echo '$0 is '"$0"'; $@ is '"$@"
valkyrie% sh ./y2
$0 is ./y2; $@ is
valkyrie% ksh ./y2
$0 is ./y2; $@ is a b c d e
valkyrie% bash ./y2
$0 is ./y2; $@ is a b c d e
valkyrie% zsh ./y2
$0 is ./y; $@ is a b c d e
valkyrie%

although apparently only zsh changes $0.

--
David A. Holland
***@netbsd.org

Christos Zoulas

2014-06-01 02:31:23 UTC

Post by Jarmo Jaakkola
Yes, of course, my bad. Your formulation would be better.
christos, you made the previous commits, could you do those language
tweaks? Or should I prepare another patch?

Sure, prepare another patch!

Thanks,

christos

Richard Hansen

2014-06-02 17:21:17 UTC

I filed a bug report that will be the basis for discussion during the
Thursday teleconference (assuming we have time to address this bug):
http://austingroupbugs.net/view.php?id=842

Post by Jarmo Jaakkola
That would be great, these things really should be clarified. This
probably belongs to "shell execution environment"? That is, whether
function and loop nesting levels are part of it or not.

I'll think about it more when I write up the proposed changes, but I
don't think that the Shell Execution Environment section is the
appropriate place. That section is all about how the shell forks (or
the equivalent) to run commands. The purpose of the "in the current
environment" phrase in the dot command's description is to let you know
that the script isn't run in a subshell or as a separate process, so it
can modify variable assignments, etc.
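
The difference between "in the current environment" and a separate process
can be sketched as follows. The file name setvar.sh and the variable name
greeting are assumptions chosen for the example, and mktemp -d is assumed to
be available.

```shell
tmpdir=$(mktemp -d) || exit 1
printf 'greeting=hello\n' > "$tmpdir/setvar.sh"

# Run as a separate process: the assignment is lost when that process exits.
greeting=unset
sh "$tmpdir/setvar.sh"
after_child=$greeting

# Run with the dot command: the assignment happens in the current shell.
. "$tmpdir/setvar.sh"
after_dot=$greeting

echo "after child process: $after_child"
echo "after dot command:   $after_dot"
rm -rf "$tmpdir"
```

Only the dot command leaves greeting changed in the calling shell.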

If dot was meant to act like C's #include and read a separate file's
lines as if they were typed where the dot command was typed, then I
think it would have been specified that way.

Post by Jarmo Jaakkola
How about local
variables, or were they just a common extension that is not specified
in POSIX?

Local variables are a common extension. Proposing text to standardize
local variables is on my todo list -- they would help solve some really
problematic parts of the current standard. See [1] and the discussion
at [2].

I should note that anyone is welcome to submit bug reports or proposed
wording changes about any topic. You can even join the next POSIX
telecon if you'd like; see [3].

[1] http://austingroupbugs.net/view.php?id=654
[2] http://thread.gmane.org/gmane.comp.standards.posix.austin.general/6728
[3] http://article.gmane.org/gmane.comp.standards.posix.austin.general/9454

Post by Jarmo Jaakkola
While we're on the subject of clarity, I'm not sure if the value $0
inside a dot command has been specified. It could be implied by
"current execution environment" but perhaps it wouldn't hurt to be
explicit.

The value of $0 in a dot script is specified -- it is the same as it was
before the script was sourced. This is not explicit -- the only way to
arrive at this conclusion is to note that the description of the dot
command does not specify any change to $0. It would be better if this
was made explicit, as it is with functions. I'll add that to my todo
list, or you can file a bug report if you'd like.
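
One way to check this on a given shell is sketched below. The file name
show0.sh and the variable names are assumptions, and the result is reported
as a verdict because zsh outside sh emulation is reported elsewhere in this
thread to change $0.

```shell
tmpdir=$(mktemp -d) || exit 1
# Single quotes keep $0 unexpanded until the file is sourced.
printf 'inside=$0\n' > "$tmpdir/show0.sh"

outside=$0
. "$tmpdir/show0.sh"

if [ "$inside" = "$outside" ]; then
    verdict=unchanged
else
    verdict=changed
fi
echo "\$0 inside the dot script is $verdict"
rm -rf "$tmpdir"
```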

Post by Jarmo Jaakkola

Post by Richard Hansen
The behavior of existing implementations will strongly influence the
direction the Austin Group takes when revising the text. With that
said, what behavior would you like POSIX to specify?

I have no opinion one way or another. I just took the easiest way to fix
existing breakage.

Post by Jarmo Jaakkola
I quickly tested with bash, ksh and zsh with a trivial
case: dot command in a while loop contains a break. bash does as my change
does: the loop is broken. ksh complains and seems to ignore the break.
zsh's behaviour is probably best described as "undefined", because it
complains, stops sourcing the file and yet does not break the loop.

Curious -- I had different results when testing 'break' from a dot
script in other shells (I documented my results in the POSIX bug report).

Post by Jarmo Jaakkola

My changes retain the previous behavior, which was to allow a break or
a continue in a function to affect a loop that contains the function call.

So the behavior from a function and the behavior from a dot script are
now the same? I like that.

Post by Jarmo Jaakkola
Quick check for other shells: bash and zsh work the same, ksh complains and
does not break the loop nor return from the function.

Hmm, my test results differed from yours in the function case as well as
the dot script case. (Not that it matters much.)

Exactly -- I think it's good for those two cases to be consistent.

-Richard

Richard Hansen

2014-06-05 18:56:00 UTC

I filed a bug report that will be the basis for discussion during the
Thursday teleconference (assuming we have time to address this bug):
http://austingroupbugs.net/view.php?id=842

That bug has been resolved, with the revised text (which will be in
Issue 7 TC2) available here:
http://austingroupbugs.net/view.php?id=842#c2257

We decided to not do anything for Issue 8 at this time, so the behavior
will continue to be unspecified if n is greater than the number of
lexically enclosing loops. The argument is that the standard has been
this way for 20+ years without any major complaints, so why force
implementations to change their code to support a case that few care about?

-Richard

David Holland

2014-05-31 23:51:35 UTC

Whether break and continue should work from the sourced
file might be debatable. Because the dot command says "in the current
environment", I'd say yes.

Offhand, I would say that continues and breaks should be statically
scoped; dynamic scoping is almost always a mistake. So you certainly
shouldn't be able to break from a loop by calling a function that
contains a break outside a loop. (Although netbsd's sh, bash, and zsh
all seem to allow this, I would call it a bug. ksh rejects it.)

How this applies to a sourced file isn't so clear though, at least
offhand, as the point of sourcing a file is to read and evaluate it
within the current context. My inclination would be that sourcing a
file is not the same as calling a function; however, I'm far from an
expert on sh.

It seems that the behavior of sourcing with respect to $0 and $@
varies among implementations, which doesn't make me happy.

netbsd's sh seems to accept it silently; ksh, bash, and zsh all reject
it. I would consider our sh broken.

Because I read the standard to mean that break and continue should have
an effect outside the sourced file, that's how I implemented it. For what
it's worth, this also seems to be what bash does.

The behavior of existing implementations will strongly influence the
direction the Austin Group takes when revising the text. With that
said, what behavior would you like POSIX to specify?

With stuff like this, I'd rather fix our implementation (or have it be
noncompliant until fixed) than standardize unprincipled behavior. FWIW.

--
David A. Holland
***@netbsd.org

Rhialto

2014-06-01 21:19:17 UTC

Post by Richard Hansen
Not necessarily. POSIX does not define "enclosing loop", so it could be
interpreted as syntactic enclosure (a break/continue command must be a
command in the compound list associated with the loop for the loop to
qualify as enclosing the command) or logical enclosure as experienced
during execution. I can see pros and cons to either behavior.

I agree. The other way is clearly unworkable. A function with a break or
continue outside a loop would be incorrect to call except inside a loop.
That is ludicrous.

Post by David Holland
How this applies to a sourced file isn't so clear though, at least
offhand, as the point of sourcing a file is to read and evaluate it
within the current context. My inclination would be that sourcing a
file is not the same as calling a function; however, I'm far from an
expert on sh.

I would say here too that static checking should be the norm. Except
that there is an inherent non-staticness here: the file could have been
generated dynamically, or the name of the file could be dynamic, or
perhaps even more annoying non-static things.

-Olaf.

--
___ Olaf 'Rhialto' Seibert -- The Doctor: No, 'eureka' is Greek for
\X/ rhialto/at/xs4all.nl -- 'this bath is too hot.'

Richard Hansen

2014-06-02 17:54:04 UTC

Whether break and continue should work from the sourced
file might be debatable. Because the dot command says "in the current
environment", I'd say yes.

I think I also prefer lexical scoping, but I can see some valid (though
unusual) uses for dynamic scoping.

Post by David Holland
How this applies to a sourced file isn't so clear though, at least
offhand, as the point of sourcing a file is to read and evaluate it
within the current context.

Yes and no -- there are subtle differences between the dot command and
C's #include preprocessor directive (exit status of the dot command,
in-line variable assignments before the dot command, redirection).
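
Two of these differences, the exit status and redirection, can be sketched
as follows. The file name part.sh is an assumption. The dot command is placed
in an if condition only so that the sketch also survives a shell running with
set -e.

```shell
tmpdir=$(mktemp -d) || exit 1
# The sourced file prints one line and then ends with a failing command.
printf 'echo from-file\nfalse\n' > "$tmpdir/part.sh"

# The redirection covers everything the sourced file writes to stdout,
# and the dot command's status is that of the last command in the file.
if . "$tmpdir/part.sh" > "$tmpdir/out.txt"; then
    dot_status=0
else
    dot_status=$?
fi

captured=$(cat "$tmpdir/out.txt")
echo "status=$dot_status captured=$captured"
rm -rf "$tmpdir"
```

A textual include has neither a status of its own nor a place to attach a
redirection.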

Post by David Holland
My inclination would be that sourcing a
file is not the same as calling a function; however, I'm far from an
expert on sh.

I don't really know the history, but my impression is that the dot
command was intended to make it possible to split common complex tasks
out into modular, reusable scripts. That sounds more like a function
and less like a preprocessor include to me, though the differences
aren't very significant.

Post by David Holland
It seems that the behavior of sourcing with respect to $0 and $@
varies among implementations, which doesn't make me happy.

They're all consistent if you don't specify any additional arguments to
the dot command. Zsh does something different with $0, but it's an
incompatible scripting language in many ways unless run in sh emulation
mode (e.g., it doesn't do word splitting by default -- a drastic
departure from POSIX shell).
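
The word-splitting difference can be seen with a short sketch. The function
name count_fields is hypothetical. A POSIX shell splits the unquoted
expansion into three fields, while zsh in its native mode would pass a
single field.

```shell
# count_fields is a hypothetical helper that reports how many arguments it got.
count_fields() {
    echo "$#"
}

words="a b c"
# Unquoted on purpose: a POSIX shell splits the value on IFS characters.
split_count=$(count_fields $words)
# Quoted: the value stays one field in every shell.
quoted_count=$(count_fields "$words")

echo "unquoted: $split_count, quoted: $quoted_count"
```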

netbsd's sh seems to accept it silently; ksh, bash, and zsh all reject
it. I would consider our sh broken.

Yes, although not necessarily non-conformant. I'd say that the behavior
of break and continue outside of a loop should be specified as
undefined, which would allow implementations to do whatever they want
(error out, silently ignore it, give it a special meaning as a fancy
extension, etc.).

Because I read the standard to mean that break and continue should have
an effect outside the sourced file, that's how I implemented it. For what
it's worth, this also seems to be what bash does.

The behavior of existing implementations will strongly influence the
direction the Austin Group takes when revising the text. With that
said, what behavior would you like POSIX to specify?

With stuff like this, I'd rather fix our implementation (or have it be
noncompliant until fixed) than standardize unprincipled behavior. FWIW.

OK, so that's one vote for static/lexical scoping.

Note that unlike variable binding, the behavior of dynamic scoping of
break/continue is a superset of static/lexical scoping (assuming no
closures): If the shell does dynamic scoping of break/continue but
scripts are always written assuming lexical scoping then those scripts
will still work as expected in all cases. If POSIX were to specify
static/lexical scoping then a shell that performs dynamic scoping would
be conformant but with an extension to the standard.

-Richard

Richard Hansen

2014-06-03 16:53:56 UTC

Post by Richard Hansen
Note that unlike variable binding, the behavior of dynamic scoping of
break/continue is a superset of static/lexical scoping (assuming no
closures): If the shell does dynamic scoping of break/continue but
scripts are always written assuming lexical scoping then those scripts
will still work as expected in all cases. If POSIX were to specify
static/lexical scoping then a shell that performs dynamic scoping would
be conformant but with an extension to the standard.

Oops -- as pointed out in [1] this is incorrect when the argument to
break/continue is greater than the number of lexically enclosing loops.
An author could write 'break 1000' to make it obvious to other readers
that all loops should be broken. A script like this might behave
differently on a shell that does dynamic scoping vs. a shell that does
lexical scoping.
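
A sketch of such a script, assuming a hypothetical helper named break_all
that has one loop of its own and is called from inside another loop. With
lexical scoping only the helper's loop is broken; with dynamic scoping the
caller's loop is broken as well.

```shell
# break_all is a hypothetical helper: one lexical loop, but 'break 1000'.
break_all() {
    for inner in x y; do
        break 1000
    done
}

outer_runs=0
for outer in 1 2 3; do
    outer_runs=$((outer_runs + 1))
    break_all
done

# 3 suggests lexical scoping; 1 suggests dynamic scoping.
echo "outer loop body ran $outer_runs time(s)"
```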

So it looks like POSIX will probably have to choose either lexical
(static) or dynamic for the next major version of the standard (Issue 8).

David voiced his preference for lexical [2]; does anyone else have an
opinion?

I think my preference is for dynamic, although I would question the
design decisions of a script author that takes advantage of dynamic
scoping. Why dynamic over lexical? Because functionality is lost by
making it lexical: If a script author wants dynamic but the shell does
lexical, the author is out of luck. If the shell does dynamic but the
author wants lexical, the author can easily count the number of
lexically enclosing loops and limit the argument of break/continue to
that value. Dynamic might also be easier to implement.
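
As a sketch of the counting approach: the break below sits inside exactly two
lexically enclosing loops, so its argument is limited to 2. The variable
names are assumptions made for the example.

```shell
outer_runs=0
inner_runs=0
for a in 1 2 3; do
    outer_runs=$((outer_runs + 1))
    for b in 1 2 3; do
        inner_runs=$((inner_runs + 1))
        # Two lexically enclosing loops, so 2 is the largest safe argument.
        break 2
    done
done

echo "outer=$outer_runs inner=$inner_runs"
```

Because the argument never exceeds the lexical nesting depth, the script
behaves the same under either scoping rule.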

Thanks,
Richard

[1] http://article.gmane.org/gmane.comp.standards.posix.austin.general/9467
[2] http://article.gmane.org/gmane.os.netbsd.devel.userlevel/17452

Joerg Sonnenberger

2014-06-03 17:27:38 UTC