Asp Forum - Matz says namespaces are too hard to implement - why?

Stefan Rusterholz

12/22/2007 3:18:00 AM

Short primer: What are namespaces?
Matz talked (in 2005, I think) about a concept he named 'namespaces'.
They would allow you to monkey patch e.g. Array in namespace :foo
without the patch being visible in namespace :bar, essentially to make
namespacing safe. He didn't implement it because it's too hard.

I wonder why it is too hard, given that Class#clone exists and shadowing
constants within a module works fine. I wrote up a small proof of
concept which still has problems and could be extended further. It's
pure Ruby; the problems should be solvable at the C level.
Probably the ugliest part is due to constant lookup rules in blocks: I
had to use string eval.

A nicely formatted version of the code is viewable at http://pastie.caboo...

== Example code
require 'namespaces'
namespace :foo, %{
  class Array
    def bar
      "bar"
    end
  end
  p Array.new.bar
}

namespace :bar, %{
  p Array.new.bar
}

== Example output
"bar"
testnamespaces.rb:xx:in `create': undefined method `bar' for []:Array (NoMethodError)
        from testnamespaces.rb:xx:in `namespace'
        from testnamespaces.rb:xx

== namespaces.rb
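module Kernel
  # Evaluate code (a string) inside the named namespace, creating it on first use.
  def namespace(name, code)
    Namespace.exist?(name) ? Namespace[name].module_eval(code) :
                             Namespace.create(name, code)
  end
end

module Namespace
  @space = {}

  class <<self
    # Namespaces are stored under upcased string keys, so :foo and "FOO" match.
    def key(name)
      name.to_s.upcase
    end

    def exist?(name)
      @space.has_key?(key(name))
    end

    def [](name)
      @space[key(name)]
    end

    def create(name, code)
      name = key(name)
      raise "Namespace '#{name}' exists already" if @space.has_key?(name)
      space = const_set(name, Module.new)
      @space[name] = space
      # Shadow every top-level constant with a clone inside the namespace.
      Object.constants.each { |c|
        begin
          space.const_set(c, Object.const_get(c).clone)
        rescue
          # skip constants that cannot be read or cloned
        end
      }
      space.module_eval(code)
      space
    end
  end
end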

Problems that a pure Ruby solution suffers from, and things that I
haven't added yet:
-Concurrency
-Literals
-Importing from other namespaces
-Recursively clone constants
-Blocks have lookup rules tied to definition context even with
module_eval, hence string evals
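
The last point in the list can be seen in isolation. The following is only a minimal sketch, not part of the posted proof of concept; the module name Demo and the symbol it stores are made up for illustration. It shows that a block passed to module_eval still resolves constants where the block was written, while a string is parsed inside the module:

```ruby
# Hypothetical module that shadows the top-level Array constant.
module Demo
  Array = :shadowed
end

# A block keeps the constant scope of the place where it was written
# (here: top level), so Array still means ::Array.
block_result = Demo.module_eval { Array }

# A string is parsed inside Demo, so the shadowing constant is found first.
string_result = Demo.module_eval("Array")

p block_result   # prints Array
p string_result  # prints :shadowed
```

This is why the proof of concept takes its namespace body as a %{...} string rather than as a block.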

So now I wonder, am I missing something? Is the devil in the details, or
is one of the incomplete things in my proof of concept the show stopper?
I see that it would be some work, but I don't see how it would be hard.
Please enlighten me :)

Regards
Stefan
--
Posted via http://www.ruby-....

38 Answers

Stefan Rusterholz

12/22/2007 3:21:00 AM

Stefan Rusterholz wrote:
> essentially to make namespacing safe.

Should of course have been: "essentially to make monkey patching safe."
Sorry, it's late here :-/

Regards
Stefan
--
Posted via http://www.ruby-....

James Britt

12/22/2007 4:23:00 AM

Stefan Rusterholz wrote:
> Stefan Rusterholz wrote:
>> essentially to make namespacing safe.
>
> Should of course have been: "essentially to make monkey patching safe."

Monkey patching?

--
James Britt

"The greatest obstacle to discovery is not ignorance, but the illusion
of knowledge."
- D. Boorstin

gemblon (t.b.)

12/22/2007 5:30:00 AM

James Britt wrote:
> Stefan Rusterholz wrote:
>> Stefan Rusterholz wrote:
>>> essentially to make namespacing safe.
>>
>> Should of course have been: "essentially to make monkey patching safe."
>
> Monkey patching?
>
> --
> James Britt

Somebody get me a Band-Aid. lol.

--
Posted via http://www.ruby-....

Robert Klemme

12/22/2007 11:31:00 AM

On 22.12.2007 04:18, Stefan Rusterholz wrote:
> So now I wonder, am I missing something? Is the devil in the details, or
> is one of the incomplete things in my proof of concept the show stopper?
> I see that it would be some work, but I don't see how it would be hard.
> Please enlighten me :)

Ultimately Matz will be the one who is able to answer this properly. My
gut guess is that - besides the issues you mentioned - performance is a
critical issue. Method lookups are so ubiquitous that everything that
slows them down should be limited as far as possible. And method lookup
will certainly suffer because the set of allowed methods is no longer
determined by a class but also by a location in the code where a method
is invoked. Especially once these selector namespaces are nested,
lookups will become complex and potentially slow.

Kind regards

robert

Charles Oliver Nutter

12/22/2007 12:21:00 PM

Robert Klemme wrote:
> Ultimately Matz will be the one who is able to answer this properly. My
> gut guess is that - besides the issues you mentioned - performance is a
> critical issue. Method lookups are so ubiquitous that everything that
> slows them down should be limited as far as possible. And method lookup
> will certainly suffer because the set of allowed methods is no longer
> determined by a class but also by a location in the code where a method
> is invoked. Especially in the light of nesting these selector
> namespaces lookups will become complex and potentially slow.

Or perhaps, the various implementers will be able to answer this
properly as well :)

So here's a long answer:

Selector namespaces would be difficult to implement largely because of
how method dispatch works. Currently, the metaclass of the target object
is 100% in control. It decides which method will be called in all cases,
and the caller is not able to influence that selection process in any
way. So we have to monkey patch the actual metaclass to have new
behavior show up for all future calls, with the down side of course
being that new behavior shows up for all future calls...you can't choose
that some paths see new behavior and some see old.
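
As an illustration of that point (a minimal sketch; the class Greeter and the method library_call are invented for the example), reopening a class changes what every caller sees, including code that knows nothing about the patch:

```ruby
# Hypothetical class standing in for any patched type.
class Greeter
  def greet
    "hello"
  end
end

# Stand-in for third-party code that simply calls the method.
def library_call(greeter)
  greeter.greet
end

before = library_call(Greeter.new)

# The monkey patch: reopen the class and replace the method.
class Greeter
  def greet
    "patched"
  end
end

after = library_call(Greeter.new)

p before  # prints "hello"
p after   # prints "patched"
```

The caller has no way to ask for the old behavior; the receiver's class alone decides.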

Groovy is an example of a modern language that supports selector
namespaces, which they call Categories. Categories allow you to apply a
set of metaclass changes to one (or more?) class on a specific thread
within a specific scope. So you can add behavior to String for a block
of code and all code it calls, and when the block finishes the behavior
disappears.

And it's dog slow, even compared to normal non-Category Groovy calls
which aren't particularly fast to begin with.

The way it's implemented in Groovy leverages the fact that all
invocations in Groovy go first to an intermediary that decides whether
the target object's metaclass should have the last say or not. If a
Category is in play, all such intermediaries will allow the Category to
have first grab at method invocation, at which time they can pretend the
target metaclass has the new behavior. Otherwise, calls pass straight
through to the metaclass.

The problem with this is that when you're *not* using a Category, you
still have to constantly check for each invocation whether a Category
has been installed. Check, check, check, check, check, check...wasting
cycles. It also adds multiple additional layers into method invocation
when you *are* in a Category, since it stacks all the Category logic on
top of the already heavy method invocation logic.

If not for Categories, no intermediary would be needed, and no checks
would be needed.

I'd expect similar semantics in Ruby, since non-thread-local namespaces
would be almost worthless (threads would see behavior on types change
forward and back at arbitrary times), and as a result similar
performance implications. Even worse, it would require shoehorning an
intermediary into the Ruby call path, further slowing it down and
complicating optimization by VMs like the JVM and CLR. And it would add
in all the same checks, since every call would have to check whether a
namespace has been installed before proceeding.
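
To make the cost concrete, here is a rough Ruby sketch of such an intermediary. It is an assumption-laden illustration, not Groovy's implementation and not a proposal for Ruby: the module CategoryDispatch, its methods, and the override table format are all hypothetical.

```ruby
# Hypothetical intermediary; none of these names come from Groovy or Ruby.
module CategoryDispatch
  # Install a table of overrides for the current thread while the block runs.
  # (Thread.current[] is strictly fiber-local in modern Ruby.)
  def self.with_category(overrides)
    stack = (Thread.current[:category_stack] ||= [])
    stack.push(overrides)
    yield
  ensure
    stack.pop if stack
  end

  # Every call site goes through here, whether or not a category is active.
  def self.call(receiver, name, *args)
    stack = Thread.current[:category_stack]
    if stack && !stack.empty?               # the check paid on every call
      stack.reverse_each do |overrides|
        impl = overrides[[receiver.class, name]]
        return impl.call(receiver, *args) if impl
      end
    end
    receiver.public_send(name, *args)       # normal dispatch
  end
end

shout = { [String, :to_s] => ->(s) { s.upcase } }

inside  = CategoryDispatch.with_category(shout) { CategoryDispatch.call("hi", :to_s) }
outside = CategoryDispatch.call("hi", :to_s)
other   = CategoryDispatch.with_category(shout) do
  Thread.new { CategoryDispatch.call("hi", :to_s) }.value
end

p inside   # prints "HI"
p outside  # prints "hi"
p other    # prints "hi"
```

Even the plain call outside any category has to read the thread-local stack and test it before falling through to normal dispatch, which is the overhead described above.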

- Charlie

Trans

12/22/2007 1:52:00 PM

Doesn't _why have an interesting approach to this problem? I forget
what it is called. I think he basically "objectified" the whole of
Ruby so it was reusable.

Also, I recently posted this to core (doubt it would ease the
implementation issues however):

Would it be possible to do selector namespaces on a file basis? That
is to say, load a library such that it would only apply to the
immediate file and no other? For example, let's say I have:

# round.rb
class Float
  def round
    0
  end
end

# foo.rb
n = 1.23
puts n.round

# boo.rb
selector_require "round" # hypothetical
require "foo"
n = 1.234
puts n.to_f

Running boo.rb, we'd get:

1
0

The 1 comes from foo.rb, but the 0 from boo.rb because we "selectively
loaded" round.rb.

I've never been satisfied with the block-oriented approaches often
cited. Is this perhaps a more useful approach? Or does this have
problems of its own?

T.

Stefan Rusterholz

12/22/2007 2:21:00 PM

> I'd expect similar semantics in Ruby, since non-thread-local namespaces
> would be almost worthless (threads would see behavior on types change
> forward and back at arbitrary times), and as a result similar
> performance implications. Even worse, it would require shoehorning an
> intermediary into the Ruby call path, further slowing it down and
> complicating optimization by VMs like the JVM and CLR. And it would add
> in all the same checks, since every call would have to check whether a
> namespace has been installed before proceeding.
>
> - Charlie

I fail to see why namespaces would have to be per thread. I understand
my code, and which methods are available in it, as a lexical issue. Just
as the classes provided by a required file are available in every
thread, I'd expect a change in a namespace to be local to the lexical
scope of the namespace.

Is there a good example of where and why per-thread namespaces would
be as much more useful as you imply, so I can follow your argument
that thread-unaware namespaces are worthless?

Also, I fail to see how lookup would become more complicated. Take a
look at the approach I took. I copy the classes into the new namespace
(at the interpreter level one could use COW techniques to reduce memory
usage). Neither the number of levels searched to look a method up nor
the way it is looked up changes in any way. Or am I missing something
there?
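
The copy-based approach can be reduced to a few lines. This is only a sketch under the assumption that a namespace is an ordinary module holding cloned classes; the module name Foo is made up, and it does not use the namespaces.rb helper:

```ruby
module Foo
  # Shadow ::Array with a private copy inside this module.
  Array = ::Array.clone

  class Array          # reopens Foo::Array, not ::Array
    def bar
      "bar"
    end
  end
end

patched = Foo::Array.new.bar
leaked  = ::Array.new.respond_to?(:bar)
# Array literals are still built from the real ::Array, even inside Foo.
literal = Foo.module_eval("[].respond_to?(:bar)")

p patched  # prints "bar"
p leaked   # prints false
p literal  # prints false
```

Method lookup on Foo::Array is ordinary class-based dispatch, which is the point being made here; the last line shows the "Literals" limitation from the original list.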

Regards
Stefan
--
Posted via http://www.ruby-....

Charles Oliver Nutter

12/22/2007 2:22:00 PM

Trans wrote:
> Doesn't _why have an interesting approach of this problem? I forget
> what it is called. I think he basically "objectified" the whole of
> Ruby so it was reusable.

Yes, sandbox. There are also discussions starting now about multi-VM
support in a future Ruby 1.9/2.0 version. Both allow you to isolate a
"sub-ruby" from changing things in the "super-ruby" but they're far more
isolated than a selector namespace would be. Basically, in sandbox (and
in the JRuby equivalents) you have to marshal data between the two
"rubies" as though they were separate processes. Hardly a seamless
integration for namespacing, but useful for other domains (_why
demonstrated multiple Rails apps in the same process, for example).

> Also, I recently posted this to core (doubt it would ease the
> implementation issues however):
>
> Would it be possible to do selector namespaces on a file basis? That
> is to say, load a library such that it would only apply to the
> immediate file and no other? For example lets say I have:

I'd say yes...but with a caveat: the namespace would only apply to
invocations within that file. I think in general it's expected that a
selector namespace would affect called code as well in the same thread.
But perhaps that wasn't intended by Matz's initial proposals of selector
namespace behavior?

Maybe we need to get our heads around what selector namespaces should
actually *be* first...

> # round.rb
> class Float
> def round
> 0
> end
> end
>
> # foo.rb
> n = 1.23
> puts n.round
>
> #boo.rb
> selector_require "round" # hypothetical
> require "foo"
> n = 1.234
> puts n.to_f
>
> running boo.rb, we'd get:
>
> 1
> 0
>
> The 1 comes from foo.rb, but the 0 from boo.rb because we "selectively
> loaded" round.rb.

This is an example of something I'd expect to *not* work, because
"round" is not called after the namespace is installed. Did you mean to
call n.round at the bottom? That I would expect to work...but no calls
to round outside this file would see the namespace.

This version would also suffer from constantly checking if a namespace
has been installed, since the calls after are independent and
selector_require would presumably be just another method call. However a
file pragma that says "this file operates under a given selector"
would probably work well...since all calls in that file could be
decorated with namespace checks during parse.

> I've never been satisfied with block-oriented approaches often cited.
> Is this perhaps a more useful approach? Or does this have problems of
> it own?

Blocks at least allow the interpreter to say "within this context, use
this namespace" and provide an "off" point where the namespace goes
away. I'm not sure either approach is better than the other, but the
pragma probably has the least impact on performance (and is maybe the
least useful).

In the end the biggest problem that namespaces introduce is that dynamic
invocation becomes...even more dynamic, since every call can suddenly
take a path completely unrelated to the object being invoked if the
namespace decides it should do so. This is the crux of the issue in
Groovy, and until there's a way to make namespaces have zero perf impact
on non-namespaced code I'd vote to keep them out.

But I still think it's worth discussing exactly what they should do and
how they should work.

- Charlie

Charles Oliver Nutter

12/22/2007 2:51:00 PM

Stefan Rusterholz wrote:
> I fail to see why namespaces would have to be per Thread. I understand
> my code and what methods are available as a lexical issue. Similar as
> when I require a file, the provided classes will be available in every
> thread, I'd expect a change in a namespace to be local to the lexical
> scope of the namespace.
>
> Is there a good example of where and why per-thread namespaces would
> make it that much more useful you imply, so I can follow your argument
> of thread unaware namespaces being worthless?

They wouldn't "have to be" but that would probably be the most useful.
If I have a namespace that changes String#to_s and I call a library that
calls String#to_s, don't I want that library to see my change?

Your version and Trans's example would work fine for very localized
namespacing, which would have much lower implementation impact (and may
also be useful).

> Also I fail to see how lookup would become more complicated. Take a look
> at the approach I took. I copy the classes (on interpreter level one
> could use COW techniques to reduce memory usage) into the new namespace.
> The numbers of levels to look a method up or the way to look it up does
> in no way change. Or do I miss something there?

The complication is that under normal circumstances a method has no
knowledge of whether it's being called inside a namespace or not, since
the installation of the namespace itself is just another method call. So
every method in the system would have to check whether it is being
called under a namespace.

Your code gets around that by essentially delaying the parse until a
namespace is already installed. While this works, and allows namespacing
within that subcontext, you lose all the benefits of having code only
get parsed once. eval is *very* expensive, even more expensive than
installing per-call namespace checks throughout the system.

- Charlie

Trans

12/22/2007 3:00:00 PM

On Dec 22, 8:52 am, Trans <transf...@gmail.com> wrote:

> # round.rb
> class Float
> def round
> 0
> end
> end
>
> # foo.rb
> n = 1.23
> puts n.round
>
> #boo.rb
> selector_require "round" # hypothetical
> require "foo"
> n = 1.234
> puts n.to_f

Of course, "n.to_f" should be "n.round".

T.

comp.lang.ruby
