Mailing List Archive

PEP 420 - dynamic path computation is missing rationale
I have just reviewed PEP 420 (namespace packages) and sent Eric my
detailed feedback; most of it is minor or requests for examples and
I'm sure he'll fix it to my satisfaction.

Generally speaking the PEP is a beacon of clarity. But I stumbled
over one feature that bothers me both in its specification and in its
lack of rationale. This is the section on Dynamic Path Computation:
(http://www.python.org/dev/peps/pep-0420/#dynamic-path-computation).
The specification bothers me because it requires in-place modification
of sys.path. Does this mean sys.path is no longer a plain list? I'm
sure it's going to break things left and right (or at least things
will be violating this requirement left and right); there has never
been a similar requirement (unlike, e.g., sys.modules, which is
relatively well-known for being cached in a C-level global variable).
Worse, this apparently affects __path__ variables of namespace
packages as well, which are now specified as an unspecified read-only
iterable. (I can only guess that there is a connection between these
two features -- the PEP doesn't mention one.) Again, I would be much
happier with just a list.

While I can imagine there being a use case for recomputing the various
paths, I am much less sure that it is worth attempting to specify that
this will happen *automatically* when sys.path is modified in a
certain way. I'd be much happier if these constraints were struck and
the recomputation had to be requested explicitly by calling some new
function in sys.

From my POV, this is the only show-stopper for acceptance of PEP 420.
(That is, either a rock-solid rationale should be supplied, or the
constraints should be removed.)

--
--Guido van Rossum (python.org/~guido)
Re: PEP 420 - dynamic path computation is missing rationale
On 5/20/2012 9:33 PM, Guido van Rossum wrote:
> Generally speaking the PEP is a beacon of clarity. But I stumbled
> over one feature that bothers me both in its specification and in its
> lack of rationale. This is the section on Dynamic Path Computation:
> (http://www.python.org/dev/peps/pep-0420/#dynamic-path-computation).
> The specification bothers me because it requires in-place modification
> of sys.path. Does this mean sys.path is no longer a plain list? I'm
> sure it's going to break things left and right (or at least things
> will be violating this requirement left and right); there has never
> been a similar requirement (unlike, e.g., sys.modules, which is
> relatively well-known for being cached in a C-level global variable).
> Worse, this apparently affects __path__ variables of namespace
> packages as well, which are now specified as an unspecified read-only
> iterable. (I can only guess that there is a connection between these
> two features -- the PEP doesn't mention one.) Again, I would be much
> happier with just a list.

sys.path would still be a plain list. It's the namespace package's
__path__ that would be a special object. Every time __path__ is accessed
it checks to see if its parent path has been modified. The parent path
for top level modules is sys.path. The __path__ object detects
modification by keeping a local copy of the parent, plus a reference to
the parent, and compares them.
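
Roughly, the idea is something like this (an illustrative sketch only, with
made-up names, not the actual pep-420 branch code):

class _DynamicPathSketch:
    def __init__(self, parent_path):
        self._parent_path = parent_path       # reference, e.g. to sys.path
        self._snapshot = tuple(parent_path)   # local copy used for comparison
        self._path = list(parent_path)        # stand-in for the computed portions

    def __iter__(self):
        current = tuple(self._parent_path)
        if current != self._snapshot:         # parent was modified in place
            self._snapshot = current
            self._path = list(self._parent_path)   # real code re-runs the finder
        return iter(self._path)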

> While I can imagine there being a use case for recomputing the various
> paths, I am much less sure that it is worth attempting to specify that
> this will happen *automatically* when sys.path is modified in a
> certain way. I'd be much happier if these constraints were struck and
> the recomputation had to be requested explicitly by calling some new
> function in sys.
>
> From my POV, this is the only show-stopper for acceptance of PEP 420.
> (That is, either a rock-solid rationale should be supplied, or the
> constraints should be removed.)

I don't have a preference on whether the feature stays or goes, so I'll
let PJE give the use case. I've copied him here in case he doesn't read
python-dev.

Now that I think about it some more, the motivation is probably to ease
the migration from setuptools, which does provide this feature.

Eric.


Re: PEP 420 - dynamic path computation is missing rationale
On Mon, May 21, 2012 at 1:00 AM, Eric V. Smith <eric@trueblade.com> wrote:
> On 5/20/2012 9:33 PM, Guido van Rossum wrote:
>> Generally speaking the PEP is a beacon of clarity. But I stumbled
>> over one feature that bothers me both in its specification and in its
>> lack of rationale. This is the section on Dynamic Path Computation:
>> (http://www.python.org/dev/peps/pep-0420/#dynamic-path-computation).
>> The specification bothers me because it requires in-place modification
>> of sys.path. Does this mean sys.path is no longer a plain list? I'm
>> sure it's going to break things left and right (or at least things
>> will be violating this requirement left and right); there has never
>> been a similar requirement (unlike, e.g., sys.modules, which is
>> relatively well-known for being cached in a C-level global variable).
>> Worse, this apparently affects __path__ variables of namespace
>> packages as well, which are now specified as an unspecified read-only
>> iterable. (I can only guess that there is a connection between these
>> two features -- the PEP doesn't mention one.) Again, I would be much
>> happier with just a list.
>
> sys.path would still be a plain list. It's the namespace package's
> __path__ that would be a special object. Every time __path__ is accessed
> it checks to see if its parent path has been modified. The parent path
> for top level modules is sys.path. The __path__ object detects
> modification by keeping a local copy of the parent, plus a reference to
> the parent, and compares them.

Ah, I see. But I disagree that this is a reasonable constraint on
sys.path. The magic __path__ object of a toplevel namespace module
should know it is a toplevel module, and explicitly refetch sys.path
rather than just keeping around a copy.

This leaves the magic __path__ objects for namespace modules, which I
could live with, as long as their repr was not the same as a list, and
assuming a good rationale is given. Although I'd still prefer plain
lists here as well; I'd like to be able to manually construct a
namespace package and force its directories to be a specific set of
directories that I happen to know about, regardless of whether they
are related to sys.path or not. And I'd like to know that my setup in
that case would not be disturbed by changes to sys.path.

>> While I can imagine there being a use case for recomputing the various
>> paths, I am much less sure that it is worth attempting to specify that
>> this will happen *automatically* when sys.path is modified in a
>> certain way. I'd be much happier if these constraints were struck and
>> the recomputation had to be requested explicitly by calling some new
>> function in sys.
>>
>> From my POV, this is the only show-stopper for acceptance of PEP 420.
>> (That is, either a rock-solid rationale should be supplied, or the
>> constraints should be removed.)
>
> I don't have a preference on whether the feature stays or goes, so I'll
> let PJE give the use case. I've copied him here in case he doesn't read
> python-dev.
>
> Now that I think about it some more, the motivation is probably to ease
> the migration from setuptools, which does provide this feature.

I'd like to hear more about this from Philip -- is that feature
actually widely used? What would a package have to do if the feature
didn't exist? I'd really much rather not have this feature, which
reeks of too much magic to me. (An area where Philip and I often
disagree. :-)

--
--Guido van Rossum (python.org/~guido)
Re: PEP 420 - dynamic path computation is missing rationale
On Mon, May 21, 2012 at 9:55 AM, Guido van Rossum <guido@python.org> wrote:

> Ah, I see. But I disagree that this is a reasonable constraint on
> sys.path. The magic __path__ object of a toplevel namespace module
> should know it is a toplevel module, and explicitly refetch sys.path
> rather than just keeping around a copy.
>

That's fine by me - the class could actually be defined to take a module
name and attribute (e.g. 'sys', 'path' or 'foo', '__path__'), and then
there'd be no need to special case anything: it would behave exactly the
same way for subpackages and top-level packages.


> This leaves the magic __path__ objects for namespace modules, which I
> could live with, as long as their repr was not the same as a list, and
> assuming a good rationale is given. Although I'd still prefer plain
> lists here as well; I'd like to be able to manually construct a
> namespace package and force its directories to be a specific set of
> directories that I happen to know about, regardless of whether they
> are related to sys.path or not. And I'd like to know that my setup in
> that case would not be disturbed by changes to sys.path.
>

To do that, you just assign to __path__, the same as now, a la __path__ =
pkgutil.extend_path(). The auto-updating is in the initially-assigned
__path__ object, not the module object or some sort of generalized magic.
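
For example, in the package's __init__.py (this is the existing idiom; the
package and directory names below are made up):

# mypkg/__init__.py
from pkgutil import extend_path

# merge portions found along sys.path, computed once at import time:
__path__ = extend_path(__path__, __name__)

# ...or pin __path__ to an explicit, fixed set of directories:
# __path__ = ['/opt/plugins/mypkg', '/srv/shared/mypkg']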


> I'd like to hear more about this from Philip -- is that feature
> actually widely used?


Well, it's built into setuptools, so yes. ;-) It gets used any time a
dynamically specified dependency is used that might contain a namespace
package. This means, for example, that every setup script out there using
"setup.py test", every project using certain paste.deploy features... it's
really difficult to spell out the scope of things that are using this, in
the context of setuptools and distribute, because there are an immense
number of ways to indirectly rely on it.

This doesn't mean that the feature can't continue to be implemented inside
setuptools' dynamic dependency system, but the code to do it in setuptools
is MUCH more complicated than the PEP 420 code, and doesn't work if you
manually add something to sys.path without asking setuptools to do it.
It's also somewhat timing-sensitive, depending on when and whether you
import 'site' and pkg_resources, and whether you are mixing eggs and
non-eggs in your namespace packages.

In short, the implementation is a huge mess that the PEP 420 approach would
vastly simplify.

But... that wasn't the original reason why I proposed it. The original
reason was simply that it makes namespace packages act more like the
equivalents do in other languages. While being able to override __path__
can be considered a feature of Python, its being static by default is NOT a
feature, in the same way that *requiring* an __init__.py is not really a
feature.

The principle of least surprise says (at least IMO) that if you add a
directory to sys.path, you should be able to import stuff from it. That
whether it works depends on whether or not you already imported part of a
namespace package earlier is both surprising and confusing. (More on this
below.)



> What would a package have to do if the feature didn't exist?


Continue to depend on setuptools to do it for them, or use some
hypothetical update API... but that's not really the right question. ;-)

The right question is, what happens to package *users* if the feature
didn't exist?

And the answer to that question is, "you must call this hypothetical update
API *every time* you change sys.path, because otherwise your imports might
break, depending on whether or not some other package imported something
from a namespace before you changed sys.path".

And of course, you also need to make sure that any third-party code you use
does this too, if it adds something to sys.path for you.

And if you're writing cross-Python-version code, you need to check to make
sure whether the API is actually available.

And if you're someone helping Python newbies, you need to add this to your
list of debugging questions for import-related problems.

And remember: if you forget to do this, it might not break now. It'll
break later, when you add that other plugin or update that random module
that dynamically decides to import something that just happens to be in a
namespace package, so be prepared for it to break your application in the
field, when an end-user is using it with a collection of plugins that you
haven't tested together, or in the same import sequence...

The people using setuptools won't have these problems, but *new* Python
users will, as people begin using a PEP 420 that lacks this feature.

The key scope question, I think, is: "How often do programs change sys.path
at runtime, and what have they imported up to that point?" (Because for
the other part of the scope, I think it's a fairly safe bet that namespace
packages are going to become even *more* popular than they are now, once
PEP 420 is in place.)

But the key API/usability question is: "What's the One Obvious Way to
add/change what's importable?"

And I believe the answer to that question is, "change sys.path", not
"change sys.path, and then import some other module to call another API to
say, 'yes, I really *meant* to update sys.path, thank you very much.'"

(Especially since NOT requiring that extra API isn't going to break any
existing code.)



> I'd really much rather not have this feature, which
> reeks of too much magic to me. (An area where Philip and I often
> disagree. :-)
>

My take on it is that it only SEEMS like magic, because we're used to
static __path__. But other languages don't have per-package __path__ in
the first place, so there's nothing to "automatically update", and so it's
not magic at all that other subpackages/modules can be found when the
system path changes!

So, under the PEP 420 approach, it's *static* __path__ that's really the
weird special case, and should be considered so. (After all, __path__ is
and was primarily an implementation optimization and compatibility hack,
rather than a user-facing "feature" of the import system.)

For example, when *would* you want to explicitly spell out a namespace
package __path__, and restrict it from seeing sys.path changes? I've not
seen *anybody* ask for this feature in the context of setuptools; it's only
ever been bug reports about when the more complicated implementation fails
to detect an update.

So, to wrap up:

* The primary rationale for the feature is that "least surprise" for a user
new to Python is that adding to sys.path should allow importing a portion
of a namespace, whether or not you've already imported some other thing in
that namespace. Symmetry with other languages and with other Python
features (e.g. changing the working directory in an interactive
interpreter) suggests it, and the removal of a similar timing dependency
from PEP 402 (preventing direct import of a namespace-only package unless
you imported a subpackage first) suggests that the same type of timing
dependency should be removed here, too. (Note, for example, that I may not
know that importing baz.spam indirectly causes some part of foo.wiz to be
imported, and that if I then add another directory to sys.path containing a
foo.* portion, my code will *no longer work* when I try to import foo.ham.
This is much more "magical" behavior, in least-surprise terms!)

* The constraints on sys.path and package __path__ objects can and should
be removed, by making the dynamic path objects refer to a module and
attribute, instead of directly referencing parent __path__ objects. Code
that currently manipulates __path__ will not break, because such code will
not be using PEP 420 namespace packages anyway (and so, __path__ will be a
list). (Even so, the most common __path__ manipulation idiom is "__path__ =
pkgutil.extend_path(...)" anyway!)

* Namespace packages are a widely used feature of setuptools, and AFAIK
nobody has *ever* asked to stop dynamic additions to namespace __path__,
but a wide assortment of things people do with setuptools rely on dynamic
additions under the hood. Providing the feature in PEP 420 gives a
migration path away from setuptools, at least for this one feature.
(Specifically, it does away with the need to use declare_namespace(), and
the need to do all sys.path manipulation via setuptools' requirements API.)

* Self-contained (__init__.py packages) and fixed __path__ lists can and
should be considered the "magic" or "special case" parts of importing in
Python 3, even though we're accustomed to them being central import
concepts in Python 2. Modules and namespace packages can and should be the
default case from an instructional POV, and sys.path updating should
reflect this. (That is, future tutorials should introduce modules, then
namespace packages, and finally self-contained packages with __init__ and
__path__, because the *idea* of a namespace package doesn't depend on
__path__ existing in the first place; it's essentially only a historical
accident that self-contained packages were implemented in Python first.)
Re: PEP 420 - dynamic path computation is missing rationale
As a simple example to back up PJE's explanation, consider:
1. encodings becomes a namespace package
2. It sometimes gets imported during interpreter startup to initialise the
standard io streams
3. An application modifies sys.path after startup and wants to contribute
additional encodings

Searching the entire parent path for new portions on every import would be
needlessly slow.

Not recognising new portions would be needlessly confusing for users. In
our simple case above, the application would fail if the io initialisation
accessed the encodings package, but work if it did not (e.g. when all
streams are utf-8).

PEP 420 splits the difference via an automatically invalidated cache: when
you iterate over a namespace package __path__ object, it rescans the parent
path for new portions *if and only if* the contents of the parent path have
changed since the previous scan.
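
A tiny illustration of the intended behaviour (module and directory names
invented):

import sys
import foo.bar                  # 'foo' is a namespace package found on sys.path

sys.path.append('/opt/extra')   # contains /opt/extra/foo/baz.py
import foo.baz                  # works: iterating foo.__path__ notices that
                                # sys.path changed and rescans for new portions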

Cheers,
Nick.

--
Sent from my phone, thus the relative brevity :)
Re: PEP 420 - dynamic path computation is missing rationale
I agree the parent path should be retrieved by name rather than a direct
reference when checking the cache validity, though.

--
Sent from my phone, thus the relative brevity :)
Re: PEP 420 - dynamic path computation is missing rationale
On 5/21/2012 2:08 PM, PJ Eby wrote:
> On Mon, May 21, 2012 at 9:55 AM, Guido van Rossum <guido@python.org
> <mailto:guido@python.org>> wrote:
>
> Ah, I see. But I disagree that this is a reasonable constraint on
> sys.path. The magic __path__ object of a toplevel namespace module
> should know it is a toplevel module, and explicitly refetch sys.path
> rather than just keeping around a copy.
>
>
> That's fine by me - the class could actually be defined to take a module
> name and attribute (e.g. 'sys', 'path' or 'foo', '__path__'), and then
> there'd be no need to special case anything: it would behave exactly the
> same way for subpackages and top-level packages.

Any reason to make this the string "sys" or "foo", and not the module
itself? Can the module be replaced in sys.modules? Mostly I'm just curious.

But regardless, I'm okay with keeping these both as strings and looking
them up in sys.modules and then by attribute.
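
i.e. something along the lines of (sketch only; the helper name is made up):

import sys

def lookup_parent_path(modname, attr):
    # e.g. ('sys', 'path') for top-level packages, ('foo', '__path__') otherwise
    return getattr(sys.modules[modname], attr)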

Eric.
Re: PEP 420 - dynamic path computation is missing rationale
On 05/21/2012 07:25 PM, Nick Coghlan wrote:
> As a simple example to back up PJE's explanation, consider:
> 1. encodings becomes a namespace package
> 2. It sometimes gets imported during interpreter startup to initialise
> the standard io streams
> 3. An application modifies sys.path after startup and wants to
> contribute additional encodings
>
> Searching the entire parent path for new portions on every import would
> be needlessly slow.
>
> Not recognising new portions would be needlessly confusing for users. In
> our simple case above, the application would fail if the io
> initialisation accessed the encodings package, but work if it did not
> (e.g. when all streams are utf-8).
>
> PEP 420 splits the difference via an automatically invalidated cache:
> when you iterate over a namespace package __path__ object, it rescans
> the parent path for new portions *if and only if* the contents of the
> parent path have changed since the previous scan.

That seems like a pretty convincing example to me.

Personally I'm +1 on putting dynamic computation into the PEP, at least
for top-level namespace packages, and probably for all namespace packages.

The code is not very large or complicated, and with the proposed removal
of the restriction that sys.path cannot be replaced, I think it behaves
well.

But Guido can decide against it without hurting my feelings.

Eric.

P.S.: Here's the current code in the pep-420 branch. This code still has
the restriction that sys.path (or parent_path in general) can't be
replaced. I'll fix that if we decide to keep the feature.

class _NamespacePath:
    def __init__(self, name, path, parent_path, path_finder):
        self._name = name
        self._path = path
        self._parent_path = parent_path
        self._last_parent_path = tuple(parent_path)
        self._path_finder = path_finder

    def _recalculate(self):
        # If _parent_path has changed, recalculate _path
        parent_path = tuple(self._parent_path)  # Make a copy
        if parent_path != self._last_parent_path:
            loader, new_path = self._path_finder(self._name, parent_path)
            # Note that no changes are made if a loader is returned, but we
            # do remember the new parent path
            if loader is None:
                self._path = new_path
            self._last_parent_path = parent_path  # Save the copy
        return self._path

    def __iter__(self):
        return iter(self._recalculate())

    def __len__(self):
        return len(self._recalculate())

    def __repr__(self):
        return "_NamespacePath" + repr((self._path, self._parent_path))

    def __contains__(self, item):
        return item in self._recalculate()



Re: PEP 420 - dynamic path computation is missing rationale
On Wed, May 23, 2012 at 12:51 AM, Eric V. Smith <eric@trueblade.com> wrote:
> That seems like a pretty convincing example to me.
>
> Personally I'm +1 on putting dynamic computation into the PEP, at least
> for top-level namespace packages, and probably for all namespace packages.

Same here, but Guido's right that the rationale (and example) should
be clearer in the PEP itself if the feature is to be retained.

> P.S.: Here's the current code in the pep-420 branch. This code still has
> the restriction that sys.path (or parent_path in general) can't be
> replaced. I'll fix that if we decide to keep the feature.

I wonder if it would be worth exposing an importlib.LazyRef API to
make it generally easy to avoid this kind of early binding problem?

class LazyRef:
    # similar API to weakref.weakref
    def __init__(self, modname, attr=None):
        self.modname = modname
        self.attr = attr

    def __call__(self):
        mod = sys.modules[self.modname]
        attr = self.attr
        if attr is None:
            return mod
        return getattr(mod, attr)
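
Usage would then be something like (purely illustrative, assuming the sketch
above):

import sys

path_ref = LazyRef('sys', 'path')
assert path_ref() is sys.path     # re-fetched on every call, so a later
                                  # rebinding of sys.path is picked up too
assert LazyRef('sys')() is sys    # attr=None returns the module itself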

Then _NamespacePath could just be defined as taking a callable that
returns the parent path:


class _NamespacePath:
   def __init__(self, name, path, parent_path, path_finder):
       self._name = name
       self._path = path
       self._parent_path = parent_path
       self._last_parent_path = tuple(parent_path)
       self._path_finder = path_finder

   def _recalculate(self):
       # If _parent_path has changed, recalculate _path
       parent_path = tuple(self._parent_path())     # Retrieve and make a copy
       if parent_path != self._last_parent_path:
           loader, new_path = self._path_finder(self._name, parent_path)
           # Note that no changes are made if a loader is returned, but we
           #  do remember the new parent path
           if loader is None:
               self._path = new_path
           self._last_parent_path = parent_path   # Save the copy
       return self._path

Even if the LazyRef idea isn't used, I still like the idea of passing
a callable in to _NamespacePath for the parent path rather than
hardcoding the "module name + attribute name" approach.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
Re: PEP 420 - dynamic path computation is missing rationale
On Wed, May 23, 2012 at 1:39 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
>     def _recalculate(self):
>         # If _parent_path has changed, recalculate _path
>         parent_path = tuple(self._parent_path())     # Retrieve and make a copy
>         if parent_path != self._last_parent_path:
>             loader, new_path = self._path_finder(self._name, parent_path)
>             # Note that no changes are made if a loader is returned, but we
>             #  do remember the new parent path
>             if loader is None:
>                 self._path = new_path
>             self._last_parent_path = parent_path   # Save the copy
>         return self._path

Oops, I also meant to say that it's probably worth at least issuing
ImportWarning if a new portion with an __init__.py gets added - it's
going to block all future dynamic updates of that namespace package.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
Re: PEP 420 - dynamic path computation is missing rationale
On 05/22/2012 11:39 AM, Nick Coghlan wrote:
> On Wed, May 23, 2012 at 12:51 AM, Eric V. Smith <eric@trueblade.com> wrote:
>> That seems like a pretty convincing example to me.
>>
>> Personally I'm +1 on putting dynamic computation into the PEP, at least
>> for top-level namespace packages, and probably for all namespace packages.
>
> Same here, but Guido's right that the rationale (and example) should
> be clearer in the PEP itself if the feature is to be retained.

Completely agreed. I'll work on it.

> Oops, I also meant to say that it's probably worth at least issuing
> ImportWarning if a new portion with an __init__.py gets added - it's
> going to block all future dynamic updates of that namespace package.

Right. That's on my list of things to clean up. It actually won't block
updates during this run of Python, though: once a namespace package,
always a namespace package. But if, on another run, that entry is on
sys.path, then yes, it will block all namespace package portions.

Eric.
Re: PEP 420 - dynamic path computation is missing rationale
On Mon, May 21, 2012 at 8:32 PM, Eric V. Smith <eric@trueblade.com> wrote:

> Any reason to make this the string "sys" or "foo", and not the module
> itself? Can the module be replaced in sys.modules? Mostly I'm just curious.
>

Probably not, but it occurred to me that storing references to modules
introduces a reference cycle that wasn't there when we were pointing to
parent path objects instead. It basically would make child packages point
to their parents, as well as the other way around.
Re: PEP 420 - dynamic path computation is missing rationale
Okay, I've been convinced that keeping the dynamic path feature is a
good idea. I am really looking forward to seeing the rationale added
to the PEP -- that's pretty much the last thing on my list that made
me hesitate. I'll leave the details of exactly how the parent path is
referenced up to the implementation team (several good points were
made), as long as the restriction that sys.path must be modified in
place is lifted.

--
--Guido van Rossum (python.org/~guido)
Re: PEP 420 - dynamic path computation is missing rationale
On Tue, May 22, 2012 at 12:31 PM, Eric V. Smith <eric@trueblade.com> wrote:

> On 05/22/2012 11:39 AM, Nick Coghlan wrote:
> > Oops, I also meant to say that it's probably worth at least issuing
> > ImportWarning if a new portion with an __init__.py gets added - it's
> > going to block all future dynamic updates of that namespace package.
>
> Right. That's on my list of things to clean up. It actually won't block
> updates during this run of Python, though: once a namespace package,
> always a namespace package. But if, on another run, that entry is on
> sys.path, then yes, it will block all namespace package portions.
>

This discussion has gotten me thinking: should we expose a
pkgutil.declare_namespace() API to allow such an __init__.py to turn itself
back into a namespace? (Per our previous discussion on transitioning
existing namespace packages.) It wouldn't need to do all the other stuff
that the setuptools version does, it would just be a way to transition away
from setuptools.

What it would do is:
1. Recursively invoke itself for parent packages
2. Create the module object if it doesn't already exist
3. Set the module __path__ to a _NamespacePath instance.

def declare_namespace(package_name):
    parent, dot, tail = package_name.rpartition('.')
    attr = '__path__'
    if dot:
        declare_namespace(parent)
    else:
        parent, attr = 'sys', 'path'
    with importlockcontext:
        module = sys.modules.get(package_name)
        if module is None:
            module = XXX new module here
        module.__path__ = _NamespacePath(...stuff involving 'parent' and 'attr')

It may be that this should complain under certain circumstances, or use the
'__path__ = something' idiom, but the above approach would be (basically)
API compatible with the standard usage of declare_namespace.

Obviously, this'll only be useful for people who are porting code going
forward, but even if a different API is chosen, there still ought to be a
way for people to do it. Namespace packages are one of a handful of
features that are still basically setuptools-only at this point (i.e. not
yet provided by packaging/distutils2), but if it's the only setuptools-only
feature a project is using, they'd be able to drop their dependency as of
3.3.

(Next up, I guess we'll need an entry-points PEP, but that'll be another
discussion. ;-) )
Re: PEP 420 - dynamic path computation is missing rationale
Minor nit.

On May 22, 2012, at 04:43 PM, PJ Eby wrote:

>def declare_namespace(package_name):
>    parent, dot, tail = package_name.rpartition('.')
>    attr = '__path__'
>    if dot:
>        declare_namespace(parent)
>    else:
>        parent, attr = 'sys', 'path'
>    with importlockcontext:
>        module = sys.modules.get(package_name)

Best to use a marker object here instead of checking for None, since the
latter is a valid value for an existing entry in sys.modules.

>        if module is None:
>            module = XXX new module here
>        module.__path__ = _NamespacePath(...stuff involving 'parent' and 'attr')
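
E.g., inside declare_namespace(), a sketch of the marker approach (with
types.ModuleType as a stand-in for however the new module actually gets
created):

import sys, types

_missing = object()   # unique marker; None is a legitimate sys.modules value

module = sys.modules.get(package_name, _missing)
if module is _missing:
    module = types.ModuleType(package_name)
    sys.modules[package_name] = module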

Cheers,
-Barry
Re: PEP 420 - dynamic path computation is missing rationale
On 5/22/2012 2:37 PM, Guido van Rossum wrote:
> Okay, I've been convinced that keeping the dynamic path feature is a
> good idea. I am really looking forward to seeing the rationale added
> to the PEP -- that's pretty much the last thing on my list that made
> me hesitate. I'll leave the details of exactly how the parent path is
> referenced up to the implementation team (several good points were
> made), as long as the restriction that sys.path must be modified in
> place is lifted.

I've updated the PEP. Let me know how it looks.

I have not updated the implementation yet. I'm not exactly sure how I'm
going to convert from a path list of unknown origin to ('sys', 'path')
or ('foo', '__path__'). I'll look at it later tonight to see if it's
possible. I'm hoping it doesn't require major surgery to
importlib._bootstrap.

I still owe PEP updates for finder/loader examples and nested namespace
package examples. But I think that's all that's needed.

Eric.
Re: PEP 420 - dynamic path computation is missing rationale
On Tue, May 22, 2012 at 8:40 PM, Eric V. Smith <eric@trueblade.com> wrote:

> On 5/22/2012 2:37 PM, Guido van Rossum wrote:
> > Okay, I've been convinced that keeping the dynamic path feature is a
> > good idea. I am really looking forward to seeing the rationale added
> > to the PEP -- that's pretty much the last thing on my list that made
> > me hesitate. I'll leave the details of exactly how the parent path is
> > referenced up to the implementation team (several good points were
> > made), as long as the restriction that sys.path must be modified in
> > place is lifted.
>
> I've updated the PEP. Let me know how it looks.
>

My name is misspelled in it, but otherwise it looks fine. ;-)

> I have not updated the implementation yet. I'm not exactly sure how I'm
> going to convert from a path list of unknown origin to ('sys', 'path')
> or ('foo', '__path__'). I'll look at it later tonight to see if it's
> possible. I'm hoping it doesn't require major surgery to
> importlib._bootstrap.
>

It shouldn't - all you should need is to use
getattr(sys.modules[self.modname], self.attr) instead of referencing a
parent path object directly.

(The more interesting thing is what to do if the parent module goes away,
due to somebody deleting the module out of sys.modules. The simplest thing
to do would probably be to just keep using the cached value in that case.)
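
i.e. something like (a sketch; the attribute names just follow the earlier
examples in this thread):

import sys

def _get_parent_path(self):
    try:
        return getattr(sys.modules[self.modname], self.attr)
    except (KeyError, AttributeError):
        # parent module was removed (or lost the attribute); keep using
        # the last value we saw instead of failing
        return self._last_parent_path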

Ah crap, I just thought of something - what happens if you reload() a
namespace package? Probably nothing, but should we specify what sort of
nothing? ;-)
Re: PEP 420 - dynamic path computation is missing rationale
On Wed, May 23, 2012 at 10:40 AM, Eric V. Smith <eric@trueblade.com> wrote:
> On 5/22/2012 2:37 PM, Guido van Rossum wrote:
>> Okay, I've been convinced that keeping the dynamic path feature is a
>> good idea. I am really looking forward to seeing the rationale added
>> to the PEP -- that's pretty much the last thing on my list that made
>> me hesitate. I'll leave the details of exactly how the parent path is
>> referenced up to the implementation team (several good points were
>> made), as long as the restriction that sys.path must be modified in
>> place is lifted.
>
> I've updated the PEP. Let me know how it looks.
>
> I have not updated the implementation yet. I'm not exactly sure how I'm
> going to convert from a path list of unknown origin to ('sys', 'path')
> or ('foo', '__path__'). I'll look at it later tonight to see if it's
> possible. I'm hoping it doesn't require major surgery to
> importlib._bootstrap.

If you wanted to do this without changing the sys.meta_path hook API,
you'd have to pass an object to find_module() that did the dynamic
lookup of the value in obj.__iter__. Something like:

class _LazyPath:
    def __init__(self, modname, attribute):
        self.modname = modname
        self.attribute = attribute

    def __iter__(self):
        return iter(getattr(sys.modules[self.modname], self.attribute))

A potentially cleaner alternative to consider is tweaking the
find_loader API spec so that it gets used at the meta path level as
well as at the path hooks level and is handed a *callable* that
dynamically retrieves the path rather than a direct reference to the
path itself.

The full signature of find_loader would then become:

def find_loader(fullname, get_path=None):
    # fullname as for find_module
    # When get_path is None, it means the finder is being called as a path hook
    # and should use the specific path entry passed to __init__
    # In this case, namespace package portions are returned as (None, portions)
    # Otherwise, the finder is being called as a meta_path hook and get_path()
    # will return the relevant path
    # Any namespace packages are then returned as (loader, portions)

There are two major consequences of this latter approach:
- the PEP 302 find_module API would now be a purely legacy interface
for both the meta_path and path_hooks, used only if find_loader is not
defined
- it becomes trivial to tell whether a particular name references a
package or not *without* needing to load it first: find_loader()
returns a non-empty iterable for the list of portions

That second consequence is rather appealing: it means you'd be able to
implement an almost complete walk of a package hierarchy *without*
having to import anything (although you would miss old-style namespace
packages and any other packages that alter their own __path__ in
__init__, so you may still want to load packages to make sure you
found everything. You could definitively answer the "is this a package
or not?" question without running any code, though).

The first consequence is also appealing, since the find_module() name
is more than a little misleading. The "find_module" name strongly
suggests that the method is expected to return a module object, and
that's just wrong - you actually find a loader, then you use that to
load the module.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
Re: PEP 420 - dynamic path computation is missing rationale
On Tue, May 22, 2012 at 9:58 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:

> If you wanted to do this without changing the sys.meta_path hook API,
> you'd have to pass an object to find_module() that did the dynamic
> lookup of the value in obj.__iter__. Something like:
>
> class _LazyPath:
>     def __init__(self, modname, attribute):
>         self.modname = modname
>         self.attribute = attribute
>     def __iter__(self):
>         return iter(getattr(sys.modules[self.modname], self.attribute))
>
> A potentially cleaner alternative to consider is tweaking the
> find_loader API spec so that it gets used at the meta path level as
> well as at the path hooks level and is handed a *callable* that
> dynamically retrieves the path rather than a direct reference to the
> path itself.
>
> The full signature of find_loader would then become:
>
> def find_loader(fullname, get_path=None):
>     # fullname as for find_module
>     # When get_path is None, it means the finder is being called as a path hook
>     # and should use the specific path entry passed to __init__
>     # In this case, namespace package portions are returned as (None, portions)
>     # Otherwise, the finder is being called as a meta_path hook and get_path()
>     # will return the relevant path
>     # Any namespace packages are then returned as (loader, portions)
>
> There are two major consequences of this latter approach:
> - the PEP 302 find_module API would now be a purely legacy interface
> for both the meta_path and path_hooks, used only if find_loader is not
> defined
> - it becomes trivial to tell whether a particular name references a
> package or not *without* needing to load it first: find_loader()
> returns a non-empty iterable for the list of portions
>
> That second consequence is rather appealing: it means you'd be able to
> implement an almost complete walk of a package hierarchy *without*
> having to import anything (although you would miss old-style namespace
> packages and any other packages that alter their own __path__ in
> __init__, so you may still want to load packages to make sure you
> found everything. You could definitively answer the "is this a package
> or not?" question without running any code, though).
>
> The first consequence is also appealing, since the find_module() name
> is more than a little misleading. The "find_module" name strongly
> suggests that the method is expected to return a module object, and
> that's just wrong - you actually find a loader, then you use that to
> load the module.
>

While I see no problem with cleaning up the interface, I'm kind of lost as
to the point of making a get_path callable, vs. just using the iterable
interface you sketched. Python has iterables, so why add a call to get the
iterable, when iter() or a straight "for" loop will do effectively the same
thing?
Re: PEP 420 - dynamic path computation is missing rationale
On Wed, May 23, 2012 at 1:58 PM, PJ Eby <pje@telecommunity.com> wrote:
> While I see no problem with cleaning up the interface, I'm kind of lost as
> to the point of making a get_path callable, vs. just using the iterable
> interface you sketched.  Python has iterables, so why add a call to get the
> iterable, when iter() or a straight "for" loop will do effectively the same
> thing?

Yeah, I'm not sure what I was thinking either, since just documenting
the interface and providing LazyPath as a public API somewhere in
importlib should suffice. Meta path hooks are already going to need to
tolerate being handed arbitrary iterables, since that's exactly what
namespace package path objects are going to be.

While I still like the idea of killing off find_module() completely
rather than leaving it in at the meta_path level, there's no reason
that needs to be done as part of PEP 420 itself. Instead, it can be
done later if anyone comes up with a concrete use case for access the
path details without loading packages and modules.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
Re: PEP 420 - dynamic path computation is missing rationale
On 05/22/2012 09:49 PM, PJ Eby wrote:
> On Tue, May 22, 2012 at 8:40 PM, Eric V. Smith <eric@trueblade.com
> <mailto:eric@trueblade.com>> wrote:
>
> On 5/22/2012 2:37 PM, Guido van Rossum wrote:
> > Okay, I've been convinced that keeping the dynamic path feature is a
> > good idea. I am really looking forward to seeing the rationale added
> > to the PEP -- that's pretty much the last thing on my list that made
> > me hesitate. I'll leave the details of exactly how the parent path is
> > referenced up to the implementation team (several good points were
> > made), as long as the restriction that sys.path must be modified in
> > place is lifted.
>
> I've updated the PEP. Let me know how it looks.
>
>
> My name is misspelled in it, but otherwise it looks fine. ;-)

Oops, sorry. Fixed (I think).

> I have not updated the implementation yet. I'm not exactly sure how I'm
> going to convert from a path list of unknown origin to ('sys', 'path')
> or ('foo', '__path__'). I'll look at it later tonight to see if it's
> possible. I'm hoping it doesn't require major surgery to
> importlib._bootstrap.
>
>
> It shouldn't - all you should need is to use
> getattr(sys.modules[self.modname], self.attr) instead of referencing a
> parent path object directly.

The problem isn't the lookup, it's coming up with self.modname and
self.attr. As it currently stands, PathFinder.find_module is given the
parent path, not the module name and attribute name used to look up the
parent path using sys.modules and getattr.

Eric.
Re: PEP 420 - dynamic path computation is missing rationale
On Wed, May 23, 2012 at 10:31 PM, Eric V. Smith <eric@trueblade.com> wrote:
> On 05/22/2012 09:49 PM, PJ Eby wrote:
>> It shouldn't - all you should need is to use
>> getattr(sys.modules[self.modname], self.attr) instead of referencing a
>> parent path object directly.
>
> The problem isn't the lookup, it's coming up with self.modname and
> self.attr. As it currently stands, PathFinder.find_module is given the
> parent path, not the module name and attribute name used to look up the
> parent path using sys.modules and getattr.

Right, that's what PJE and I were discussing. Instead of passing in
the path object directly, you can instead pass an object that *lazily*
retrieves the path object in its __iter__ method:

class LazyIterable:
    """On iteration, retrieves a reference to a named iterable and
    returns an iterator over that iterable"""
    def __init__(self, modname, attribute):
        self.modname = modname
        self.attribute = attribute
    def __iter__(self):
        # Will almost always get a hit directly in sys.modules
        mod = import_module(self.modname)
        return iter(getattr(mod, self.attribute))

Where importlib currently passes None or sys.path as the path argument
to find_module(), instead pass "LazyIterable('sys', 'path')" and where
it currently passes package.__path__, instead pass
"LazyIterable(package.__name__, '__path__')".

The existing for loop iteration and tuple() calls should then take
care of the lazy lookup automatically.

That way, the only code that needs to know the values of modname and
attribute is the code that already has access to those values.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
Re: PEP 420 - dynamic path computation is missing rationale
On 05/23/2012 09:02 AM, Nick Coghlan wrote:
> On Wed, May 23, 2012 at 10:31 PM, Eric V. Smith <eric@trueblade.com> wrote:
>> On 05/22/2012 09:49 PM, PJ Eby wrote:
>>> It shouldn't - all you should need is to use
>>> getattr(sys.modules[self.modname], self.attr) instead of referencing a
>>> parent path object directly.
>>
>> The problem isn't the lookup, it's coming up with self.modname and
>> self.attr. As it currently stands, PathFinder.find_module is given the
>> parent path, not the module name and attribute name used to look up the
>> parent path using sys.modules and getattr.
>
> Right, that's what PJE and I were discussing. Instead of passing in
> the path object directly, you can instead pass an object that *lazily*
> retrieves the path object in its __iter__ method:

Hey, one message at a time! I'm just reading those now.

I'd like to hear Brett's comments on this approach.

Eric.

Re: PEP 420 - dynamic path computation is missing rationale
On May 23, 2012 9:02 AM, "Nick Coghlan" <ncoghlan@gmail.com> wrote:
>
> On Wed, May 23, 2012 at 10:31 PM, Eric V. Smith <eric@trueblade.com>
> wrote:
> > On 05/22/2012 09:49 PM, PJ Eby wrote:
> >> It shouldn't - all you should need is to use
> >> getattr(sys.modules[self.modname], self.attr) instead of referencing a
> >> parent path object directly.
> >
> > The problem isn't the lookup, it's coming up with self.modname and
> > self.attr. As it currently stands, PathFinder.find_module is given the
> > parent path, not the module name and attribute name used to look up the
> > parent path using sys.modules and getattr.
>
> Right, that's what PJE and I were discussing. Instead of passing in
> the path object directly, you can instead pass an object that *lazily*
> retrieves the path object in its __iter__ method:
>
> class LazyIterable:
>     """On iteration, retrieves a reference to a named iterable and
>     returns an iterator over that iterable"""
>     def __init__(self, modname, attribute):
>         self.modname = modname
>         self.attribute = attribute
>     def __iter__(self):
>         # Will almost always get a hit directly in sys.modules
>         mod = import_module(self.modname)
>         return iter(getattr(mod, self.attribute))
>
> Where importlib currently passes None or sys.path as the path argument
> to find_module(), instead pass "LazyIterable('sys', 'path')" and where
> it currently passes package.__path__, instead pass
> "LazyIterable(package.__name__, '__path__')".
>
> The existing for loop iteration and tuple() calls should then take
> care of the lazy lookup automatically.
>
> That way, the only code that needs to know the values of modname and
> attribute is the code that already has access to those values.

Perhaps calling it a ModulePath instead of a LazyIterable would be better?

Also, this is technically a change from PEP 302, which says the actual
sys.path or __path__ are passed to find_module(). I'm not sure whether any
find_module() code ever written actually *cares* about this, though.
(Especially if, as I believe I understand in this context, we're only
talking about meta-importers.)
Re: PEP 420 - dynamic path computation is missing rationale
On Wed, May 23, 2012 at 9:10 AM, Eric V. Smith <eric@trueblade.com> wrote:

> On 05/23/2012 09:02 AM, Nick Coghlan wrote:
> > On Wed, May 23, 2012 at 10:31 PM, Eric V. Smith <eric@trueblade.com>
> wrote:
> >> On 05/22/2012 09:49 PM, PJ Eby wrote:
> >>> It shouldn't - all you should need is to use
> >>> getattr(sys.modules[self.modname], self.attr) instead of referencing a
> >>> parent path object directly.
> >>
> >> The problem isn't the lookup, it's coming up with self.modname and
> >> self.attr. As it currently stands, PathFinder.find_module is given the
> >> parent path, not the module name and attribute name used to look up the
> >> parent path using sys.modules and getattr.
> >
> > Right, that's what PJE and I were discussing. Instead of passing in
> > the path object directly, you can instead pass an object that *lazily*
> > retrieves the path object in its __iter__ method:
>
> Hey, one message at a time! I'm just reading those now.
>
> I'd like to hear Brett's comments on this approach.


If I understand the proposal correctly, this would be a change in
NamespaceLoader in how it sets __path__ and in no way affect any other code
since __import__() just grabs the object on __path__ and passes it as an
argument to the meta path finders which just iterate over the object, so I
have no objections to it.

-Brett
