comp.lang.ruby

Re: On Symbols

Amr Malik

1/5/2006 5:27:00 AM

The longish 'foo' vs :foo thread was helpful to me as a "nuby" in terms
of understanding Ruby Symbols. Personally, I like to see the dirty
details of what happens when a symbol is encountered in the token
stream.

For example, for me it would be instructional to hear from a Ruby expert
as to what the interpreter does when it sees :foo or some such.

I don't think the discussions of lower level details are all bad. Most
of us may be new to Ruby, but most, if not all, have quite a few years in
computing, so there might still be instructional value in giving the
details on what exactly happens when a Ruby 'Symbol' is encountered in
the token stream.

It might benefit those who wish to look a little deeper.

Thanks to all the experts for taking the time to go over this for the
benefit of us newcomers (and then going the extra mile to ask whether
it made sense; now that's truly refreshing to see in an online
community) :)

Cheers,

-A



Devin Mullins wrote:
> Hey, all you lurkers:
>
> Have any of the explanations in the thread (What is the difference
> between :foo and "foo" ?) helped you understand symbols? A combination
> of them? Or a combination of a couple of explanations, and an irb
> session? Which ones? Why? If you don't want to get involved in (what
> seems to be turning into) a flamewar, email me personally (hint: don't
> click "Reply" :P), and I'll compile the results anonymously.
>
> I think we all agree that everybody learns differently, and so I think
> we're in dire need of feedback from somebody other than Chad (no
> offense).
>
> Devin

--
Posted via http://www.ruby-....


8 Answers

Evan Webb

1/5/2006 7:22:00 AM

Here is a short breakdown of how symbols (and other immediates) are
implemented:

A variable holds a value. That value is an integer. The value of the
integer determines what it means. For example:

If the integer is odd, then the remaining bits of the integer are a
Fixnum value.

This means that if you do

a = 0

the interpreter stores in the local variable table the value
0x00000001. If you had assigned 4 to a, then the value would be
0x00000041. This allows for all Fixnums to not require additional
memory to represent. The same goes for true, false, nil, and symbols.
For the first 3, they are:

Name    Backend Integer Value
=============================
false   0
true    2
nil     4
undef   6 (This isn't accessible from native Ruby code, but is used internally)
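
A minimal Ruby sketch of the encoding described so far, assuming a single low tag bit for Fixnums and the special values from the table. The names `encode_fixnum` and `SPECIAL_VALUES` are made up for illustration and are not part of the interpreter. With a one-bit tag, 0 comes out as 0x00000001 and 4 comes out as 0x00000009.

```ruby
# Hypothetical model of the tagged-integer scheme; names are illustrative only.
# undef (6) is left out because no Ruby-level object corresponds to it.
SPECIAL_VALUES = { false => 0, true => 2, nil => 4 }.freeze

# A Fixnum n is stored as n shifted left one bit with the low bit set.
def encode_fixnum(n)
  (n << 1) | 1
end

[0, 1, 4].each do |n|
  puts format("%d is stored as 0x%08x", n, encode_fixnum(n))
end

SPECIAL_VALUES.each do |object, value|
  puts format("%s is stored as 0x%08x", object.inspect, value)
end
```

Because every Fixnum encoding is odd and every special value is even, the two groups can never collide.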

For symbols, the least significant byte is 0x0e and the upper 3 bytes
are an integer. The integer is a uniquely assigned value for a string.
Think about it as the table index for a string. By using a symbol, you
basically allocate a string once and then refer to it by the index it
occupies in a special symbol table. For example, the first time the
symbol :evan is seen, a string containing "evan" is created and stored
in the symbol table at, say, index 9323. The variable that was assigned
:evan gets assigned
((9223 << 8 ) | 0x0e). The next time :evan is seen, "evan" is looked up
in the symbol table to obtain 9232 again.

So, to review:

a = :evan

a.to_i # => 9232 (the index in the symbol table)
a.object_id # => 2363406 (the index << 8 | 0x0e)
a.to_s # => the reference to the string object located at
index 9232 in the symbol table
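
To make the table-index idea concrete, here is a toy model in Ruby. It is only a sketch: `ToySymbolTable` and its methods are hypothetical names, the starting index is passed in so the demo can land on 9232, and the real interpreter keeps this table in C. Returning a copy from `name_for` is a choice made in this sketch.

```ruby
# Hypothetical model of the symbol table; not the interpreter's real data structure.
class ToySymbolTable
  SYMBOL_FLAG = 0x0e

  def initialize(first_index = 0)
    @names = []            # position -> name
    @index_of = {}         # name -> index
    @offset = first_index  # lets the demo start at an arbitrary index
  end

  # Return the index for name, adding it to the table on first sight.
  def intern(name)
    @index_of[name] ||= begin
      @names << name.dup.freeze
      @offset + @names.size - 1
    end
  end

  # The tagged value a variable would hold for this symbol.
  def tagged_value(name)
    (intern(name) << 8) | SYMBOL_FLAG
  end

  # Look the name back up by index; hands out a copy each time.
  def name_for(index)
    @names.fetch(index - @offset).dup
  end
end

table = ToySymbolTable.new(9232)
puts table.intern("evan")        # index assigned on first sight
puts table.intern("evan")        # same index on every later lookup
puts table.tagged_value("evan")  # (index << 8) | 0x0e
puts table.name_for(9232)
```

With a starting index of 9232, the tagged value works out to 2363406, which matches the object_id in the review.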

The Ruby runtime takes the integer value and applies these rules to
determine what it means. If it's odd, it's a Fixnum immediate value. If
it's 0, 2, 4, or 6, it's a "core" immediate value. If its least
significant byte is 0x0e, it's a symbol. Otherwise, it's a pointer to a
memory address that holds the information about the object.
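
Those rules can be sketched as a small classifier. This is a hypothetical Ruby helper (`classify` is a made-up name) that applies the checks in the order given, assuming a one-bit Fixnum tag and an 8-bit symbol tag; the real checks live in the interpreter's C code.

```ruby
# Hypothetical classifier for a tagged value, following the order of the rules above.
def classify(value)
  # Odd values are Fixnums; the remaining bits are the number itself.
  return [:fixnum, value >> 1] if value.odd?

  case value
  when 0 then [:special, false]
  when 2 then [:special, true]
  when 4 then [:special, nil]
  when 6 then [:special, :undef]
  else
    if (value & 0xff) == 0x0e
      [:symbol, value >> 8]  # upper bits are the symbol table index
    else
      [:pointer, value]      # anything else is treated as an object address
    end
  end
end

p classify(9)
p classify(4)
p classify(2363406)
p classify(0x1000)
```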

That's the end of today's lesson! Hope it helps!

Evan Webb // evan@fallingsnow.net

Tim Hunter

1/5/2006 12:25:00 PM

evanwebb@gmail.com wrote:
>
> Thats the end of days leason! Hope it helps!
>
> Evan Webb // evan@fallingsnow.net
>

Nicely done, Evan! Very good explanation. Thanks!

Jacob Fugal

1/5/2006 4:46:00 PM

If I may make one correction/clarification to an otherwise *excellent*
explanation of the implementation...

On 1/5/06, evanwebb@gmail.com <evanwebb@gmail.com> wrote:
> For symbols, the least significant byte is 0x0e and the upper 3 bytes
> are a integer. The integer is uniquely assigned value for a string.
> Think about it as the table index for a string. By using a symbol, you
> basically allocate a string once and then refer to it by the index it
> occupies in a special symbol table. For example, the first time the
> symbol :evan is seen, a string containing "evan" is created and stored

Clarification: a *C* string containing "evan" is created...

> in the symbol table at, say, index 9323. The variable that was assigned
> :evan gets assigned
> ((9223 << 8 ) | 0x0e). The next time :evan is seen, "evan" is looked up
> in the symbol table to obtain 9232 again.
>
> So, to review:
>
> a = :evan
>
> a.to_i # => 9232 (the index in the symbol table)
> a.object_id # => 2363406 (the index << 8 | 0x0e)
> a.to_s # => the reference to the string object located at
> index 9232 in the symbol table

Correction: a.to_s returns a reference to a new String object
containing the same sequence of characters as the C string in the
symbol table. This is visible when you compare the result of
#object_id on subsequent calls to a.to_s:

irb> a = :evan # => :evan
irb> a.to_s.object_id # => 1657424
irb> a.to_s.object_id # => 1653204
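
The same check can be written without comparing object_id values by eye. A small sketch, where `to_s_allocates?` is a hypothetical helper name; it relies only on `Symbol#to_s` and `equal?` (object identity).

```ruby
# Hypothetical helper: true when two calls to to_s give equal but distinct Strings.
def to_s_allocates?(sym)
  first = sym.to_s
  second = sym.to_s
  first == second && !first.equal?(second)
end

a = :evan
puts to_s_allocates?(a)           # same characters, different String objects
puts a.equal?("evan".to_sym)      # the symbol itself is one shared object
```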

> The ruby runtime rules take the integer value and apply the rules to
> determine what it means. If it's odd, it's a Fixnum immediate value. If
> it's 0,2,4, or 6, it's a "core" immediate value. If it has the LSB is
> 0x0e, it's a symbol. Otherwise, it's a pointer to a memory address that
> holds the information about the object.
>
> Thats the end of days leason! Hope it helps!

Thanks, Evan! Aside from those minor, pedantic corrections, it was
indeed an excellent lesson.

Jacob Fugal


Amr Malik

1/5/2006 6:48:00 PM

evanwebb@gmail.com wrote:
> Here is a short breakdown of how symbols (and other immediates are
> implemented):
>
...

> By using a symbol, you
> basically allocate a string once and then refer to it by the index it
> occupies in a special symbol table. For example, the first time the
> symbol :evan is seen, a string containing "evan" is created and stored
> in the symbol table at, say, index 9323. The variable that was assigned
> :evan gets assigned
> ((9223 << 8 ) | 0x0e). The next time :evan is seen, "evan" is looked up
> in the symbol table to obtain 9232 again.
>
> So, to review:
>
> a = :evan
>
> a.to_i # => 9232 (the index in the symbol table)
> a.object_id # => 2363406 (the index << 8 | 0x0e)
> a.to_s # => the reference to the string object located at
> index 9232 in the symbol table
>
> The ruby runtime rules take the integer value and apply the rules to
> determine what it means. If it's odd, it's a Fixnum immediate value. If
> it's 0,2,4, or 6, it's a "core" immediate value. If it has the LSB is
> 0x0e, it's a symbol. Otherwise, it's a pointer to a memory address that
> holds the information about the object.
>
> Thats the end of days leason! Hope it helps!
>
> Evan Webb // evan@fallingsnow.net

Thanks Evan, much appreciated!

This is how I understand it now:

1. When the interpreter sees :<string> it looks in the "symbol table"

2. If it finds the value, it returns its int index (or the computed
object_id?); otherwise it creates a new entry.

3. somestringofchars.object_id returns something which is a function (in
the mathematical sense) of the index of somestringofchars in the symbol
table (i.e., indexofsymbolstring << 8 | 0x0e).

4. ':' is just a way of giving the interpreter the heads up that a
symbol is coming up in the token stream (ie we think we know what we're
talking about, can you please look it up in the symbol table? nice
interpreter.. nice interpreter)

5. Some object_id values are computed differently, for example in the
session below (I don't know why; it's a hole in my understanding of how
object_ids are assigned):

irb(main):054:0> false.object_id
=> 0
irb(main):055:0> 0.object_id
=> 1
irb(main):056:0> true.object_id
=> 2
irb(main):057:0> 1.object_id
=> 3
irb(main):058:0> nil.object_id
=> 4
irb(main):059:0> def.object_id # not sure why
irb(main):060:1> undef.object_id # possibly cuz its strictly internal

6. Whether a token is valid or not, it gets added to the symbol table
and an object_id _can be_ computed from the symbol based on what type
of symbol it is (only if it's a valid object, methinks). Otherwise an
error is thrown.

8. there is a separate table which holds the variables. I'm not sure if
this is true from what I've seen in irb, it looks like a variable's
symbol gets stored in the symbol table as well or at least a symbol
which may point to its value location.

8.1 Every time we refer to a variable var, the interpreter uses the
:var thingy. So if we did xx="Hi there", there would be an :xx created,
but I don't know how to get to "Hi there" from :xx (:xx.to_s just gives
me "xx").

9. a symbol is just an atomic representation (AFA-the user-IC) of a
token added to the symbol table and exposed via the Symbol class so we
can use it if we want to instead of creating new string objects for
referring to things like methods etc and incurring needless overhead
(however small it might be).

10. I'm guessing the Ruby interpreter needs symbols for its own
housekeeping (obviously) but the implementers were just being nice and
allowed end users to use them too for certain specific situations (I
can't think of a good example).
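
On points 8.1 and 10, a short Ruby sketch may help. It assumes a Ruby recent enough to have `Binding#local_variable_get` (2.1 or later, so not available in 2006); `value_of` is a hypothetical helper name. It shows one way to get from :xx back to the variable's value, plus a few everyday places where symbols name things.

```ruby
# Hypothetical helper: look up a local variable by its symbol in a given binding.
def value_of(name, scope)
  scope.local_variable_get(name)
end

xx = "Hi there"
puts value_of(:xx, binding)

# Method names are symbols, so symbols show up wherever methods are named.
puts "hi".respond_to?(:upcase)
puts "hi".send(:upcase)
puts String.instance_methods.first.class

# Symbols are also a common choice for hash keys.
options = { :verbose => true, :retries => 3 }
puts options[:retries]
```

Note that the symbol only names the variable; the value is still found through the binding, which is why :xx.to_s gives back just "xx".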

So, basically, the first thing the interpreter does is take the
token and stuff it in the symbol table, then it figures out what to do
with it (steps 1..n). And since we have access to the symbol for a
given reference, why not use that instead of referring to it via a
string object which gets created anew every time we refer to it, even
though the end result is the same.

thanks,

-A


--
Posted via http://www.ruby-....


Evan Webb

1/5/2006 8:30:00 PM

Jacob,

Correct. The symbol table holds pointers to C strings and new String
objects are created with each call to Symbol#to_s.

I should note that Symbols are native Ruby access to the Ruby runtime
ID type. It's for this reason that C strings are stored in the symbol
table, because the C functions for using IDs (rb_intern and
rb_id2name) use/return char * and ID.
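
From Ruby code, the closest visible counterparts are `String#intern` (also spelled `String#to_sym`) and `Symbol#to_s`. Treating them as the Ruby-level face of rb_intern and rb_id2name is an assumption of this sketch, and `round_trip` is a hypothetical helper name.

```ruby
# Hypothetical helper: go from a String to its Symbol and back again.
def round_trip(name)
  sym = name.intern  # String -> Symbol
  [sym, sym.to_s]    # Symbol -> new String
end

sym, str = round_trip("evan")
puts sym.inspect
puts str
puts sym.equal?("evan".to_sym)  # interning the same name gives the same symbol
```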

Simon Kröger

1/5/2006 8:41:00 PM

Just because there seems to be so much confusion, I would like to
point out some minor flaws in your post so nobody else stumbles over
them. (Please correct me if the error is on my side.)

evanwebb@gmail.com wrote:
> Here is a short breakdown of how symbols (and other immediates are
> implemented):
>
> A variable holds a value. That value is a integer. The value of the
> integer determines what it means. For example:
>
> If the integer is odd, then the remaining bits of the integer are a
> Fixnum value.
>
> This means that if you do
>
> a = 0
>
> the interpreter stores in the local variable table the value
> 0x00000001. If you had assigned 4 to a, then the value would be
> 0x00000041. This allows for all Fixnums to not require additional

No, I think it would store 0x00000009, because only the first bit
is reserved, not the first nibble. ( 4 << 1 | 1)
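
A quick way to check that arithmetic in Ruby, assuming the one-bit tag. `untag_fixnum` is a hypothetical helper name.

```ruby
# Hypothetical helper: strip the one-bit tag from a stored Fixnum value.
def untag_fixnum(stored)
  raise ArgumentError, "not a tagged Fixnum" if stored.even?
  stored >> 1
end

puts format("0x%08x", 4 << 1 | 1)  # << binds tighter than |, so this is (4 << 1) | 1
puts untag_fixnum(0x09)            # the Fixnum that 0x09 stands for
puts untag_fixnum(0x41)            # the Fixnum that 0x41 would stand for
```

Under this rule 0x09 decodes to 4, while 0x41 decodes to 32.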

> memory to represent. The same goes for true, false, nil, and symbols.
> For the first 3, they are:
>
> Name Backend Integer Value
> =========================
> false 0
> true 2
> nil 4
> undef 6 (This isnt accessible from native ruby code, but is used
> internally)
>
> For symbols, the least significant byte is 0x0e and the upper 3 bytes
> are a integer. The integer is uniquely assigned value for a string.
> Think about it as the table index for a string. By using a symbol, you
> basically allocate a string once and then refer to it by the index it
> occupies in a special symbol table. For example, the first time the
> symbol :evan is seen, a string containing "evan" is created and stored
> in the symbol table at, say, index 9323. The variable that was assigned
> :evan gets assigned
> ((9223 << 8 ) | 0x0e). The next time :evan is seen, "evan" is looked up

This should obviously be ((9232 << 8 ) | 0x0e)

> in the symbol table to obtain 9232 again.
>
> So, to review:
>
> a = :evan
>
> a.to_i # => 9232 (the index in the symbol table)
> a.object_id # => 2363406 (the index << 8 | 0x0e)
> a.to_s # => the reference to the string object located at
> index 9232 in the symbol table

This seems to create a copy each time. (at least if there is no ruby
string around)

> The ruby runtime rules take the integer value and apply the rules to
> determine what it means. If it's odd, it's a Fixnum immediate value. If
> it's 0,2,4, or 6, it's a "core" immediate value. If it has the LSB is
> 0x0e, it's a symbol. Otherwise, it's a pointer to a memory address that
> holds the information about the object.
>
> Thats the end of days leason! Hope it helps!
>
> Evan Webb // evan@fallingsnow.net

Thanks, this may be really helpful for those who want to understand
symbols and have a decent idea of how interpreters work.

cheers

Simon


Tim McNamara

11/11/2010 12:15:00 AM

In article
<05d8155c-bcad-4bb3-999b-9ce8ed04d827@h21g2000vbh.googlegroups.com>,
335 <335player@gmail.com> wrote:

> It's kind of ironic how some people on the left are more intolerant
> and cliche ridden in their views then the right wing tea partiers
> they prattle on and complain about all day long.

Only if you're not paying attention to the &%*# spouted by the right
wing every day, 24/7. Then you see who's really intolerant.

--
Gotta make it somehow on the dreams you still believe.

sheetsofsound

11/11/2010 1:06:00 AM

On Nov 10, 7:14 pm, Tim McNamara <tim...@bitstream.net> wrote:
> In article
> <05d8155c-bcad-4bb3-999b-9ce8ed04d...@h21g2000vbh.googlegroups.com>,
>
>  335 <335pla...@gmail.com> wrote:
> > It's kind of ironic how some people on the left are more intolerant
> > and cliche ridden in their views then the right wing tea partiers
> > they prattle on and complain about all day long.
>
> Only if you're not paying attention to the &%*# spouted by the right
> wing every day, 24/7.  Then you see who's really intolerant.

People notice what they've been trained to notice. I may be
predisposed to notice racism but if you're white and have never had a
gun pointed at your head by a southern sheriff or been thrown in jail
for traveling in the south with a black family in the '60s (as I have)
then you may not notice racism or understand why there needs to be
laws to ensure against bias in hiring, etc.

As to Obama, I dislike many things he's done. I'm not sure why he's
still got such a large presence in the middle east and happen to think
that it's a waste of time trying to catch Bin Laden and attempting to
promote democracy in a region which is unlikely to stay democratic
after we leave. I dislike his waffling on don't ask don't tell and
many other things. However, I'm not going to pretend I don't notice
when folks are predisposed to dislike because he's "not christian" or
thinks "islam is the greatest religion" or "wasn't born in the USA" or
"is not patriotic". All that is faux news bullsh@t.