Mailing List Archive

Removing dead bytecode vs reporting syntax errors
Hi,

Recently, we moved the optimization for the removal of dead code of the form

if 0:
....

to the ast so we use JUMP bytecodes instead (being completed in PR14116).
The reason is that
currently, any syntax error in the block will never be reported. For
example:

if 0:
return

if 1:
pass
else:
return

while 0:
return


at module level do not raise any syntax error (just some examples), In
https://bugs.python.org/issue37500 it was reported
that after that, code coverage will decrease as coverage.py sees these new
bytecodes (even if they are not executed). In general,
the code object is a bit bigger and the optimization now it requires an
JUMP instruction to be executed, but syntax errors are reported.

The discussion on issue 37500 is about if we should prioritize the
optimization or the correctness of reporting syntax errors. In my opinion,
SyntaxErrors should be reported with independence of the value of variables
(__debug__) or constant as is a property of the code being written
not of the code being executed. Also, as CPython is the reference
implementation of Python, the danger here is that it could be interpreted
that
this optimization is part of the language and its behavior should be
mirrored in every other Python implementation. Elsewhere we have always
prioritize correctness over speed or optimizations.

I am writing this email to know what other people think. Should we revert
the change and not report Syntax Errors on optimized blocks? Someone
sees a viable way of reporting the errors and not emitting the bytecode for
these block?

Regards,
Pablo
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
On 06.07.2019 0:28, Pablo Galindo Salgado wrote:
> Hi,
>
> Recently, we moved the optimization for the removal of dead code of the form
>
> if 0:
> ....
>
> to the ast so we use JUMP bytecodes instead (being completed in PR14116). The reason is that
> currently, any syntax error in the block will never be reported. For example:
>
> if 0:
> return
>
> if 1:
> pass
> else:
> return
>
> while 0:
> return
>
> at module level do not raise any syntax error (just some examples), In https://bugs.python.org/issue37500 it was reported
> that after that, code coverage will decrease as coverage.py sees these new bytecodes (even if they are not executed). In general,
> the code object is a bit bigger and the optimization now it requires an JUMP instruction to be executed, but syntax errors are reported.
>
> The discussion on issue 37500 is about if we should prioritize the optimization or the correctness of reporting syntax errors. In my opinion,
> SyntaxErrors should be reported with independence of the value of variables (__debug__) or constant as is a property of the code being written
> not of the code being executed. Also, as CPython is the reference implementation of Python, the danger here is that it could be
> interpreted that
> this optimization is part of the language and its behavior should be mirrored in every other Python implementation. Elsewhere we have always
> prioritize correctness over speed or optimizations.
>
> I am writing this email to know what other people think. Should we revert the change and not report Syntax Errors on optimized blocks? Someone
> sees a viable way of reporting the errors and not emitting the bytecode for these block?
>
>
"Correctness over speed" is Python's core value, so any syntax errors must be reported.

Is this optimization so important anyway? `if 0:' seems a niche use case (yes, it's in site.py which is in every installation but the gain
there is pretty small)

If it is, why not kill two birds with one stone and discard the subtree, but _after_ it has passed scrunity?

AFAIK the standard approach to optimization is to construct a raw tree first and only then apply any equivalence transformations to it.

--
Regards,
Ivan
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
On Jul 5, 2019, at 17:28, Pablo Galindo Salgado <pablogsal@gmail.com> wrote:
> [...]
> I am writing this email to know what other people think. Should we revert the change and not report Syntax Errors on optimized blocks? Someone
> sees a viable way of reporting the errors and not emitting the bytecode for these block?

Just to be clear, the question on the table is whether to revert this change in behavior for 3.8.0 and beyond.

The original change had also been backported and appears in 3.7.4rc1 and rc2. Now that its compatibility implications are clearer, the change will be reverted for 3.7.4 final and will not be considered for future 3.7.x releases.

--
Ned Deily
nad@python.org -- []
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DVJC2LWCFEWHA7VUXWPMEJB4P7QMGBGS/
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
Hi,

Until a solution which makes everyone happy can be found, I suggest to
move back to the status quo: revert the change.

More people seems to expect "if 0: ..." to be removed, than people who
care of syntax errors on "if 0".

--

Would it be possible to detect if the "if 0" block would raise a
syntax error, and only remove it if it doesn't raise a syntax error?

That's the approach I chose in my fatoptimizer project which is
implemented as an AST optimizer:
https://github.com/vstinner/fatoptimizer/blob/master/fatoptimizer/dead_code.py

See the tests to see which cases are *not* optimized:
https://github.com/vstinner/fatoptimizer/blob/master/test_fatoptimizer.py#L2428

Some examples (the "dead code elimitaiton" is not only about "if 0",
but also "while 0", dead code after return, etc.):

self.check_dont_optimize("""
def func():
if 0:
yield
""")

self.check_dont_optimize("while 1: x = 1")

self.check_dont_optimize("""
def func(obj):
return
if 0:
yield from obj
""")

self.check_dont_optimize("""
try:
pass
except Exception:
yield 3
""")

See also the doc:
https://fatoptimizer.readthedocs.io/en/latest/optimizations.html#dead-code

--

About code coverage, it seems like -X noopt would help:
https://github.com/python/cpython/pull/13600

But I'm not sure anymore after Ned Batchelder wrote:

"The real-word implications from my world are this: if your code has
"if 0:" clauses in it, and you measure its coverage, then because the
lines have not been optimized away, coverage.py will think the lines
are a possible execution path, and will be considered a miss because
they are not executed. This will reduce your coverage percentage."
https://bugs.python.org/issue37500#msg347362

Does it mean that coverage.py will report even more "false positive"
using -X noopt?

Victor
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/SANS5X5R4M5HVOBPYBPS5HASLAXI2U5K/
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
It seems that the issue that originally caused compatibility issues was
that `__debug__` statements were being optimized away, which was
apparently desirable from coverage's point of view. It's not clear to
me, but it seems that this may also impact what bytecode is generated
when Python is run in optimized mode, because statements of the form `if
__debug__:` will no longer be completely optimized out under `-O` (note
that from what I can tell, `assert` statements are still optimized out
under `-O`).

Does anyone have performance sensitive code that relies on `if
__debug__` so that we can look at a benchmark? The issues with code
coverage aside, if it's a significant issue, maybe it is worth
considering a special case for `if __debug__` (I don't know enough about
the implementation details to know how difficult or annoying this would
be to maintain).

Best,
Paul

On 7/5/19 5:51 PM, Ivan Pozdeev via Python-Dev wrote:
>
> "Correctness over speed" is Python's core value, so any syntax errors
> must be reported.
>
> Is this optimization so important anyway? `if 0:' seems a niche use
> case (yes, it's in site.py which is in every installation but the gain
> there is pretty small)
>
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
> Until a solution which makes everyone happy can be found, I suggest to
move back to the status quo: revert the change.

>More people seems to expect "if 0: ..." to be removed, than people who
care of syntax errors on "if 0".

I don't think this is that clear. As Paul wrote on the issue this is the
result of fixing a bug that has been open since
2008 (11 years), which itself was noticed independently, also in 2008
(#1875 and #1920, respectively).
He also independently discovered the same issue last year when writing some
tests for IPython.

https://bugs.python.org/msg347394

On Fri, 5 Jul 2019 at 23:10, Victor Stinner <vstinner@redhat.com> wrote:

> Hi,
>
> Until a solution which makes everyone happy can be found, I suggest to
> move back to the status quo: revert the change.
>
> More people seems to expect "if 0: ..." to be removed, than people who
> care of syntax errors on "if 0".
>
> --
>
> Would it be possible to detect if the "if 0" block would raise a
> syntax error, and only remove it if it doesn't raise a syntax error?
>
> That's the approach I chose in my fatoptimizer project which is
> implemented as an AST optimizer:
>
> https://github.com/vstinner/fatoptimizer/blob/master/fatoptimizer/dead_code.py
>
> See the tests to see which cases are *not* optimized:
>
> https://github.com/vstinner/fatoptimizer/blob/master/test_fatoptimizer.py#L2428
>
> Some examples (the "dead code elimitaiton" is not only about "if 0",
> but also "while 0", dead code after return, etc.):
>
> self.check_dont_optimize("""
> def func():
> if 0:
> yield
> """)
>
> self.check_dont_optimize("while 1: x = 1")
>
> self.check_dont_optimize("""
> def func(obj):
> return
> if 0:
> yield from obj
> """)
>
> self.check_dont_optimize("""
> try:
> pass
> except Exception:
> yield 3
> """)
>
> See also the doc:
> https://fatoptimizer.readthedocs.io/en/latest/optimizations.html#dead-code
>
> --
>
> About code coverage, it seems like -X noopt would help:
> https://github.com/python/cpython/pull/13600
>
> But I'm not sure anymore after Ned Batchelder wrote:
>
> "The real-word implications from my world are this: if your code has
> "if 0:" clauses in it, and you measure its coverage, then because the
> lines have not been optimized away, coverage.py will think the lines
> are a possible execution path, and will be considered a miss because
> they are not executed. This will reduce your coverage percentage."
> https://bugs.python.org/issue37500#msg347362
>
> Does it mean that coverage.py will report even more "false positive"
> using -X noopt?
>
> Victor
>
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
[Pablo Galindo Salgado <pablogsal@gmail.com>]
> Recently, we moved the optimization for the removal of dead code of the form
>
> if 0:
> ....
>
> to the ast so we use JUMP bytecodes instead (being completed in PR14116). The
> reason is that currently, any syntax error in the block will never be reported.
> For example:

"Any syntax error" is way overstated. "Almost all" syntax errors will
be reported. The ones that won't be aren't "really" about syntax (as
most people view it). but more restrictions on the context in which
certain statements can appear:

> if 0:
> return
>
> if 1:
> pass
> else:
> return
>
> while 0:
> return
>
>
> at module level do not raise any syntax error (just some examples),

Whereas in function scope, those are all fine, with "0" or "1". It's
the "module level" context that matters to those. Regardless of
context, a syntax error that's actually about syntax ;-) is reported:

if 0:
x +


> In https://bugs.python.org/issue37500 it was reported
> that after that, code coverage will decrease as coverage.py sees these
> new bytecodes (even if they are not executed). In general,
> the code object is a bit bigger and the optimization now it
> requires an JUMP instruction to be executed, but syntax errors
> are reported.

And I added other gripes to the report.

I have one file (temp.py) using "if 0:" over 400 times. When I need
to write a small program or example (for, e.g. a StackOverflow
answer), I add a new "if 1:" block at the end, fiddle with it until
it's ready, then change the "1:" to "0:". So the snippets stay around
forever, but are effectively commented out so have no effect
(unless/until they're needed again).

There are, of course, other ways to comment them out, but none so
convenient. For example, under the "if 1:" block, the code is already
indented 4 spaces, so can be pasted exactly as-is into a StackOverflow
answer.

But it doesn't really matter what I happen to do: I'm just one of
millions of users, who have come to rely on this stuff for way over a
decade.

> The discussion on issue 37500 is about if we should prioritize the optimization
> or the correctness of reporting syntax errors. In my opinion,

Which doesn't really make sense unless it's logically _necessary_ to
"pick just one - you can't have both".

> SyntaxErrors should be reported with independence of the value of variables
> (__debug__) or constant as is a property of the code being written not of the
> code being executed. Also, as CPython is the reference implementation of Python,
> the danger here is that it could be interpreted that this optimization is part of the
> language and its behavior should be mirrored in every other Python implementation.

It's been that way for at least 15 years (since Python 2.4, so it's
hard to be worried about that now. Indeed, Armin Rigo posted the
original example, and he wasn't fooled a bit about intent ;-)

> Elsewhere we have always prioritize correctness over speed or optimizations.

When they're irreconcilably in conflict, and the cost of correctness
isn't "just too much", sure.


> I am writing this email to know what other people think. Should we revert the change
> and not report Syntax Errors on optimized blocks? Someone sees a viable way of
> reporting the errors and not emitting the bytecode for these block?

We should revert the change until someone (not me ;-) ) thinks harder
about whether it's possible to get both. There doesn't seem to be,
e.g., a conceptual problem with simply throwing away "if 0:" subtrees
from the AST before generating code, or from snipping the "if 1:"
guard off an "if 1:" block.

You started your msg by saying "we moved the optimization", but that's
not so: the optimization was eliminated. So just finish "moving" it
;-)
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ULCEMMNQQ657LFEK6C5OX3S7ZL4YAV4E/
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
On 2019-07-06, Victor Stinner wrote:
> More people seems to expect "if 0: ..." to be removed, than people who
> care of syntax errors on "if 0".

One small data point: I have shipped code that depended on 'if 0'
removing code from the .pyc file. The code inside was not meant to
be released publicly in the case someone inspects the .pyc file.

I could have solved the problem in a different way, e.g. have a tool
that removes all the code inside the 'if'. Having a tool that
toggles 'if DEV_MODE' to 'if 0' was simpler.

I will freely admit that is a bit of a dirty solution. I knew that
Python removes those blocks but I'm not sure that is guaranteed
anywhere. I think it is maybe more important that we give the
syntax errors.

So, I don't care strongly one way or another. However, there is
other code out there that likely depends on the behavior (not just
for code coverage).

Regards,

Neil
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7CHVZU54OM5UTI6VQ6P3EHX3BTUQFP4Y/
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
> You started your msg by saying "we moved the optimization", but that's
not so: the optimization was eliminated. So just finish "moving" it
;-)

PR14116 was "finishing" the moving by adding JUMPS (a weaker optimization
but still an optimization).

> We should revert the change until someone (not me ;-) ) thinks harder
about whether it's possible to get both. There doesn't seem to be,
e.g., a conceptual problem with simply throwing away "if 0:" subtrees
from the AST before generating code, or from snipping the "if 1:"
guard off an "if 1:" block.

Ok, these were very good points. I am convinced. I agree that the better
thing to do at this pointis to revert this until we can think this a bit
more :)

I have made a PR to revert the change.

On Fri, 5 Jul 2019 at 23:27, Tim Peters <tim.peters@gmail.com> wrote:

> [Pablo Galindo Salgado <pablogsal@gmail.com>]
> > Recently, we moved the optimization for the removal of dead code of the
> form
> >
> > if 0:
> > ....
> >
> > to the ast so we use JUMP bytecodes instead (being completed in
> PR14116). The
> > reason is that currently, any syntax error in the block will never be
> reported.
> > For example:
>
> "Any syntax error" is way overstated. "Almost all" syntax errors will
> be reported. The ones that won't be aren't "really" about syntax (as
> most people view it). but more restrictions on the context in which
> certain statements can appear:
>
> > if 0:
> > return
> >
> > if 1:
> > pass
> > else:
> > return
> >
> > while 0:
> > return
> >
> >
> > at module level do not raise any syntax error (just some examples),
>
> Whereas in function scope, those are all fine, with "0" or "1". It's
> the "module level" context that matters to those. Regardless of
> context, a syntax error that's actually about syntax ;-) is reported:
>
> if 0:
> x +
>
>
> > In https://bugs.python.org/issue37500 it was reported
> > that after that, code coverage will decrease as coverage.py sees these
> > new bytecodes (even if they are not executed). In general,
> > the code object is a bit bigger and the optimization now it
> > requires an JUMP instruction to be executed, but syntax errors
> > are reported.
>
> And I added other gripes to the report.
>
> I have one file (temp.py) using "if 0:" over 400 times. When I need
> to write a small program or example (for, e.g. a StackOverflow
> answer), I add a new "if 1:" block at the end, fiddle with it until
> it's ready, then change the "1:" to "0:". So the snippets stay around
> forever, but are effectively commented out so have no effect
> (unless/until they're needed again).
>
> There are, of course, other ways to comment them out, but none so
> convenient. For example, under the "if 1:" block, the code is already
> indented 4 spaces, so can be pasted exactly as-is into a StackOverflow
> answer.
>
> But it doesn't really matter what I happen to do: I'm just one of
> millions of users, who have come to rely on this stuff for way over a
> decade.
>
> > The discussion on issue 37500 is about if we should prioritize the
> optimization
> > or the correctness of reporting syntax errors. In my opinion,
>
> Which doesn't really make sense unless it's logically _necessary_ to
> "pick just one - you can't have both".
>
> > SyntaxErrors should be reported with independence of the value of
> variables
> > (__debug__) or constant as is a property of the code being written not
> of the
> > code being executed. Also, as CPython is the reference implementation of
> Python,
> > the danger here is that it could be interpreted that this optimization
> is part of the
> > language and its behavior should be mirrored in every other Python
> implementation.
>
> It's been that way for at least 15 years (since Python 2.4, so it's
> hard to be worried about that now. Indeed, Armin Rigo posted the
> original example, and he wasn't fooled a bit about intent ;-)
>
> > Elsewhere we have always prioritize correctness over speed or
> optimizations.
>
> When they're irreconcilably in conflict, and the cost of correctness
> isn't "just too much", sure.
>
>
> > I am writing this email to know what other people think. Should we
> revert the change
> > and not report Syntax Errors on optimized blocks? Someone sees a viable
> way of
> > reporting the errors and not emitting the bytecode for these block?
>
> We should revert the change until someone (not me ;-) ) thinks harder
> about whether it's possible to get both. There doesn't seem to be,
> e.g., a conceptual problem with simply throwing away "if 0:" subtrees
> from the AST before generating code, or from snipping the "if 1:"
> guard off an "if 1:" block.
>
> You started your msg by saying "we moved the optimization", but that's
> not so: the optimization was eliminated. So just finish "moving" it
> ;-)
>
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
I can understand the desire for correctness.  I do have to wonder
though: has anyone *other* than language implementors noticed this issue
with SyntaxErrors not being reported?

Perhaps we should remember "Practicality beats purity"?

--Ned.

On 7/5/19 6:14 PM, Pablo Galindo Salgado wrote:
> > Until a solution which makes everyone happy can be found, I suggest to
> move back to the status quo: revert the change.
>
> >More people seems to expect "if 0: ..." to be removed, than people who
> care of syntax errors on "if 0".
>
> I don't think this is that clear. As Paul wrote on the issue this is
> the  result of fixing a bug that has been open since
> 2008 (11 years), which itself was noticed independently, also in 2008
> (#1875 and #1920, respectively).
> He also independently discovered the same issue last year when writing
> some tests for IPython.
>
> https://bugs.python.org/msg347394
>
> On Fri, 5 Jul 2019 at 23:10, Victor Stinner <vstinner@redhat.com
> <mailto:vstinner@redhat.com>> wrote:
>
> Hi,
>
> Until a solution which makes everyone happy can be found, I suggest to
> move back to the status quo: revert the change.
>
> More people seems to expect "if 0: ..." to be removed, than people who
> care of syntax errors on "if 0".
>
> --
>
> Would it be possible to detect if the "if 0" block would raise a
> syntax error, and only remove it if it doesn't raise a syntax error?
>
> That's the approach I chose in my fatoptimizer project which is
> implemented as an AST optimizer:
> https://github.com/vstinner/fatoptimizer/blob/master/fatoptimizer/dead_code.py
>
> See the tests to see which cases are *not* optimized:
> https://github.com/vstinner/fatoptimizer/blob/master/test_fatoptimizer.py#L2428
>
> Some examples (the "dead code elimitaiton" is not only about "if 0",
> but also "while 0", dead code after return, etc.):
>
>     self.check_dont_optimize("""
>         def func():
>             if 0:
>                 yield
>     """)
>
>     self.check_dont_optimize("while 1: x = 1")
>
>     self.check_dont_optimize("""
>         def func(obj):
>             return
>             if 0:
>                 yield from obj
>     """)
>
>     self.check_dont_optimize("""
>         try:
>             pass
>         except Exception:
>             yield 3
>     """)
>
> See also the doc:
> https://fatoptimizer.readthedocs.io/en/latest/optimizations.html#dead-code
>
> --
>
> About code coverage, it seems like -X noopt would help:
> https://github.com/python/cpython/pull/13600
>
> But I'm not sure anymore after Ned Batchelder wrote:
>
> "The real-word implications from my world are this: if your code has
> "if 0:" clauses in it, and you measure its coverage, then because the
> lines have not been optimized away, coverage.py will think the lines
> are a possible execution path, and will be considered a miss because
> they are not executed.  This will reduce your coverage percentage."
> https://bugs.python.org/issue37500#msg347362
>
> Does it mean that coverage.py will report even more "false positive"
> using -X noopt?
>
> Victor
>
>
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-leave@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LWZPZQ2IL67DPX3RC342TTOVCLKDSBTJ/
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
I think this summarizes the situation very well :)

https://xkcd.com/1172/

On Fri, 5 Jul 2019 at 22:28, Pablo Galindo Salgado <pablogsal@gmail.com>
wrote:

> Hi,
>
> Recently, we moved the optimization for the removal of dead code of the
> form
>
> if 0:
> ....
>
> to the ast so we use JUMP bytecodes instead (being completed in PR14116).
> The reason is that
> currently, any syntax error in the block will never be reported. For
> example:
>
> if 0:
> return
>
> if 1:
> pass
> else:
> return
>
> while 0:
> return
>
>
> at module level do not raise any syntax error (just some examples), In
> https://bugs.python.org/issue37500 it was reported
> that after that, code coverage will decrease as coverage.py sees these new
> bytecodes (even if they are not executed). In general,
> the code object is a bit bigger and the optimization now it requires an
> JUMP instruction to be executed, but syntax errors are reported.
>
> The discussion on issue 37500 is about if we should prioritize the
> optimization or the correctness of reporting syntax errors. In my opinion,
> SyntaxErrors should be reported with independence of the value of
> variables (__debug__) or constant as is a property of the code being written
> not of the code being executed. Also, as CPython is the reference
> implementation of Python, the danger here is that it could be interpreted
> that
> this optimization is part of the language and its behavior should be
> mirrored in every other Python implementation. Elsewhere we have always
> prioritize correctness over speed or optimizations.
>
> I am writing this email to know what other people think. Should we revert
> the change and not report Syntax Errors on optimized blocks? Someone
> sees a viable way of reporting the errors and not emitting the bytecode
> for these block?
>
> Regards,
> Pablo
>
>
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
That's a bit of a fine line to walk. I noticed it when writing tests for IPython, which is not a implementation of Python, but is dealing with the nitty gritty details and manipulating the syntax tree it's true, but it's roughly the same class of project as implementing coverage.py, so if we disqualify all the people who notice these bugs because they are working on abstruse meta-code manipulation, I think both sides will come up empty.

On July 5, 2019 10:39:27 PM UTC, Ned Batchelder <ned@nedbatchelder.com> wrote:
>I can understand the desire for correctness.  I do have to wonder
>though: has anyone *other* than language implementors noticed this
>issue
>with SyntaxErrors not being reported?
>
>Perhaps we should remember "Practicality beats purity"?
>
>--Ned.
>
>On 7/5/19 6:14 PM, Pablo Galindo Salgado wrote:
>> > Until a solution which makes everyone happy can be found, I suggest
>to
>> move back to the status quo: revert the change.
>>
>> >More people seems to expect "if 0: ..." to be removed, than people
>who
>> care of syntax errors on "if 0".
>>
>> I don't think this is that clear. As Paul wrote on the issue this is
>> the  result of fixing a bug that has been open since
>> 2008 (11 years), which itself was noticed independently, also in 2008
>
>> (#1875 and #1920, respectively).
>> He also independently discovered the same issue last year when
>writing
>> some tests for IPython.
>>
>> https://bugs.python.org/msg347394
>>
>> On Fri, 5 Jul 2019 at 23:10, Victor Stinner <vstinner@redhat.com
>> <mailto:vstinner@redhat.com>> wrote:
>>
>> Hi,
>>
>> Until a solution which makes everyone happy can be found, I
>suggest to
>> move back to the status quo: revert the change.
>>
>> More people seems to expect "if 0: ..." to be removed, than
>people who
>> care of syntax errors on "if 0".
>>
>> --
>>
>> Would it be possible to detect if the "if 0" block would raise a
>> syntax error, and only remove it if it doesn't raise a syntax
>error?
>>
>> That's the approach I chose in my fatoptimizer project which is
>> implemented as an AST optimizer:
>>
>https://github.com/vstinner/fatoptimizer/blob/master/fatoptimizer/dead_code.py
>>
>> See the tests to see which cases are *not* optimized:
>>
>https://github.com/vstinner/fatoptimizer/blob/master/test_fatoptimizer.py#L2428
>>
>> Some examples (the "dead code elimitaiton" is not only about "if
>0",
>> but also "while 0", dead code after return, etc.):
>>
>>     self.check_dont_optimize("""
>>         def func():
>>             if 0:
>>                 yield
>>     """)
>>
>>     self.check_dont_optimize("while 1: x = 1")
>>
>>     self.check_dont_optimize("""
>>         def func(obj):
>>             return
>>             if 0:
>>                 yield from obj
>>     """)
>>
>>     self.check_dont_optimize("""
>>         try:
>>             pass
>>         except Exception:
>>             yield 3
>>     """)
>>
>> See also the doc:
>>
>https://fatoptimizer.readthedocs.io/en/latest/optimizations.html#dead-code
>>
>> --
>>
>> About code coverage, it seems like -X noopt would help:
>> https://github.com/python/cpython/pull/13600
>>
>> But I'm not sure anymore after Ned Batchelder wrote:
>>
>> "The real-word implications from my world are this: if your code
>has
>> "if 0:" clauses in it, and you measure its coverage, then because
>the
>> lines have not been optimized away, coverage.py will think the
>lines
>> are a possible execution path, and will be considered a miss
>because
>> they are not executed.  This will reduce your coverage
>percentage."
>> https://bugs.python.org/issue37500#msg347362
>>
>> Does it mean that coverage.py will report even more "false
>positive"
>> using -X noopt?
>>
>> Victor
>>
>>
>> _______________________________________________
>> Python-Dev mailing list -- python-dev@python.org
>> To unsubscribe send an email to python-dev-leave@python.org
>> https://mail.python.org/mailman3/lists/python-dev.python.org/
>> Message archived at
>https://mail.python.org/archives/list/python-dev@python.org/message/LWZPZQ2IL67DPX3RC342TTOVCLKDSBTJ/
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
Ok, I think I found a way to make everyone happy. I have updated the issue
and the PR with the solution.

Basically the idea is to add an attribute to the compiler that is checked
when emitting the bytecode
and if is false, we stop don't emit anything but we do check for errors.
Unless I am fundamentally wrong
(which is possible as is late here) is very simple and covers all the
cases.

It would be great if someone could review the PR to check if I missed
anything with this argument.

On Fri, 5 Jul 2019 at 22:28, Pablo Galindo Salgado <pablogsal@gmail.com>
wrote:

> Hi,
>
> Recently, we moved the optimization for the removal of dead code of the
> form
>
> if 0:
> ....
>
> to the ast so we use JUMP bytecodes instead (being completed in PR14116).
> The reason is that
> currently, any syntax error in the block will never be reported. For
> example:
>
> if 0:
> return
>
> if 1:
> pass
> else:
> return
>
> while 0:
> return
>
>
> at module level do not raise any syntax error (just some examples), In
> https://bugs.python.org/issue37500 it was reported
> that after that, code coverage will decrease as coverage.py sees these new
> bytecodes (even if they are not executed). In general,
> the code object is a bit bigger and the optimization now it requires an
> JUMP instruction to be executed, but syntax errors are reported.
>
> The discussion on issue 37500 is about if we should prioritize the
> optimization or the correctness of reporting syntax errors. In my opinion,
> SyntaxErrors should be reported with independence of the value of
> variables (__debug__) or constant as is a property of the code being written
> not of the code being executed. Also, as CPython is the reference
> implementation of Python, the danger here is that it could be interpreted
> that
> this optimization is part of the language and its behavior should be
> mirrored in every other Python implementation. Elsewhere we have always
> prioritize correctness over speed or optimizations.
>
> I am writing this email to know what other people think. Should we revert
> the change and not report Syntax Errors on optimized blocks? Someone
> sees a viable way of reporting the errors and not emitting the bytecode
> for these block?
>
> Regards,
> Pablo
>
>
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
This thread is based on https://bugs.python.org/issue37500
to get the thoughts of a broader range of people.

On 7/5/2019 5:28 PM, Pablo Galindo Salgado wrote:

> Recently, we moved the optimization for the removal of dead code of the form

> if 0:
>   ....

> to the ast

My impression is that you stopped at least the (complete) removal of
such code.

An example use case for "if 0:"
if 0:
<code for algorithm version A>
else:
<code for algorithm version B>

It is easy to switch between branches. Both branches should pass and be
covered by tests, but this is impossible when using a constant, while
using a constant results in leaner, cleaner .pyc.
(I combined here paraphrases of Tim Peters' msg347383 concerns and Net
Batchhelder's msg347296 coverage concerns.)

> so we use JUMP bytecodes instead (being completed in

Are you leaving the code and just jumping over it?

> PR14116). The reason is that
> currently, any syntax error in the block will never be reported. For
> example:
>
> if 0:
>   return
>
> if 1:
> pass
> else:
> return
>
> while 0:
> return
>
>
> at module level do not raise any syntax error (just some examples), In
> https://bugs.python.org/issue37500 it was reported
> that after that, code coverage will decrease as coverage.py sees these
> new bytecodes (even if they are not executed). In general,
> the code object is a bit bigger and the optimization now it requires an
> JUMP instruction to be executed, but syntax errors are reported.
>
> The discussion on issue 37500 is about if we should prioritize the
> optimization or the correctness of reporting syntax errors.

This is only an issue if we cannot do both.

> In my opinion,
> SyntaxErrors should be reported with independence of the value of
> variables (__debug__)

__debug__ is defined as a boolean constant, set on startup

> or constant as is a property of the code being written
> not of the code being executed. Also, as CPython is the reference
> implementation of Python, the danger here is that it could be
> interpreted that
> this optimization is part of the language and its behavior should be

> mirrored in every other Python implementation. Elsewhere we have always
> prioritize correctness over speed or optimizations.

I am not sure I see your point here. The python language documentation
for __debug__ and assert make no mention of CPython.
https://docs.python.org/3/library/constants.html#__debug__
https://docs.python.org/3/reference/simple_stmts.html#the-assert-statement

As I said on the issue, it is not unreasonable to me to infer from the
above that skipping 'if 0' clauses is at least close to being a language
feature, and certainly not a minor CPython implementation detail to be
casually changed.

https://docs.python.org/3/using/cmdline.html does say
"Other implementations’ command line schemes may differ."
but you are patching CPython, not some other implementation.

https://docs.python.org/3/using/cmdline.html#cmdoption-o
says -O removes "assert statements and any code conditional on the value
of __debug__". Your initial patch appears to violate this. As for
other implementations, I would think that any implementation with -O or
the claimed equivalent should do the same.

> I am writing this email to know what other people think. Should we
> revert the change and not report Syntax Errors on optimized blocks? Someone
> sees a viable way of reporting the errors and not emitting the bytecode
> for these block?

I presume that most of us agree that the last option, doing both, would
be best, or at least agreeable. So I prefer that this be explored more
before we fight too hard over a binary choice than might not be
necessary. Right now, I would say that the 15-year status quo, bytecode
deletion, should remain for 3.8 and that the Steering Council should
ultimately decide for 3.9 should a decision be necessary.

--
Terry Jan Reedy

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HDGHKDECPO5IUNKGSUGHF7YJBN25ACKN/
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
Thanks Terry for your comments!

> I presume that most of us agree that the last option, doing both, would be best, or at least agreeable.

I created a PR reverting the change but I (may) have found a way of doing both things :)
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FZKYJZXVSIDJPB5EIAC2KKVE42JXG55P/
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
Hopefully Pablo's proposed solution works. If it doesn't, could this one optimization be left in the peephole optimizer at bytecode level? Otherwise is another solution to follow through with https://discuss.python.org/t/switch-pythons-parsing-tech-to-something-more-powerful-than-ll-1/379 and switch the parser so it can handle all syntax errors on its own without support from the AST analyzer?
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LN236JC6M3IGPC6GT43TCEOJRDNLO53T/
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
On Mon, Jul 8, 2019 at 1:51 PM Brett Cannon <brett@python.org> wrote:

> Hopefully Pablo's proposed solution works.


I'm sure it will.


> If it doesn't, could this one optimization be left in the peephole
> optimizer at bytecode level? Otherwise is another solution to follow
> through with
> https://discuss.python.org/t/switch-pythons-parsing-tech-to-something-more-powerful-than-ll-1/379
> and switch the parser so it can handle all syntax errors on its own without
> support from the AST analyzer?
>

Thinking about this, that's possible, but it would require bloating the
grammar with variants that allow continue/break or not, allow return/yield
or not, allow await or not. So I think this particular thing is still best
handled by a separate check. (The PEG-based parser I am contemplating would
be able to tell an expression statement from an assignment statement
without a separate pass to make sure you don't try to assign to a call, and
it would allow a much more elegant approach to keyword arguments and the
walrus operator.)

--Guido van Rossum (python.org/~guido)
*Pronouns: he/him/his **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Re: Removing dead bytecode vs reporting syntax errors [ In reply to ]
>. If it doesn't, could this one optimization
be left in the peephole optimizer at bytecode level?

Sadly no, because at that time there is not enough information left to do things correctly. The problem manifest with constructs like

if something or __debug__:
...

You cannot just look at the bytecode before the POP_JUMP_IF_FALSE (or similar) because you don't know reliably which bytecodes correspond to the entire condition (at least that I know of).

My proposal (it seem that works, but is being reviewed still) is basically doing a compiler pass that just check for errors over the blocks that we don't want to generate bytecode instead of not visiting them at all. I think is the less intrusive and faster solution that I can think off.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/22SLXBQI6RSVUSCKW25YGQFESD5OCEBV/