
comp.lang.python

Traversing through Dir

Andrej Mitrovic

3/26/2010 12:26:00 AM

I would like to traverse through the entire structure of dir(), and
write it to a file.

Now, if I try to write the contents of dir() to a file (via pickle), I
only get the top layer. So even if there are lists within the returned
list from dir(), they get written as a list of strings to the file.

Basically, I have an embedded and somewhat stripped version of Python.
I would like to find out just how much functionality it has (I have no
documentation for it), so I thought the best way to do that is
traverse thru the dir() call. Any clues as to how I could write the
whole structure to a file? I guess I'll need some kind of recursion
here. :)

3 Answers

Alf P. Steinbach

3/26/2010 8:18:00 AM


The built-in dir() function just produces a sequence of strings.
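
This is why pickling the result only captures one layer. A minimal sketch of the point (the name `sample` is only illustrative): dir() hands back names, and the objects behind those names have to be fetched separately with getattr().

```python
# dir() returns attribute *names*; the objects themselves are not included.
sample = 42  # illustrative starting object
names = dir(sample)

print(type(names).__name__)                      # the container type
print(all(isinstance(n, str) for n in names))    # every entry is a str

# The object behind a name must be looked up explicitly.
first = getattr(sample, names[0])
print(names[0], "->", type(first).__name__)
```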

You can inspect the attributes via the getattr() function. The getattr()
function produces a reference to an object (as does every expression). Problem:
if you start with the number 42 and apply dir(), do getattr() on the first
string, apply dir() on that object, and so on, with CPython you then get into an
infinite recursion because those objects are produced on demand...


<code language="Py3">
obj = 42
obj_name = "42"
for n in range( 12 ):
    where = id( obj )
    t = type( obj ).__name__
    print( "{:>2} {:>10} {:20} of type '{}'".format( n, where, obj_name, t ) )
    attribute_names = dir( obj )
    obj_name = attribute_names[0]
    obj = getattr( obj, obj_name )
</code>
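
The "produced on demand" part can be checked directly. The following is a CPython-specific sketch, not something guaranteed by the language: looking up the same attribute on a method-wrapper twice yields two distinct objects. The names `wrapper`, `first` and `second` are just for illustration.

```python
# CPython-specific sketch: each attribute lookup on a method-wrapper
# builds a fresh wrapper object, so the chain never repeats an identity.
wrapper = (42).__abs__        # method-wrapper bound to the int 42
first = wrapper.__call__      # a wrapper around the wrapper
second = wrapper.__call__     # looked up again while 'first' is still alive

print(first is second)        # compares identities of the two lookups
print(type(first).__name__, type(second).__name__)
```

One consequence is that an identity-based "already seen" check alone would not end such a chain; a depth limit is needed as well.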


Similarly, if you do this with the Turtle module as starting point, with CPython
you get into a different kind of infinite recursion because the chain of
attributes so obtained is circular.


<code language="Py3">
import turtle

obj = turtle
obj_name = 'turtle'
for n in range( 12 ):
    where = id( obj )
    t = type( obj ).__name__
    print( "{:>2} {:>10} {:20} of type '{}'".format( n, where, obj_name, t ) )
    attribute_names = dir( obj )
    obj_name = attribute_names[0]
    obj = getattr( obj, obj_name )
</code>
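
The turtle example needs Tk to be installed. As a stand-in that runs anywhere, here is a small sketch of the same kind of cycle using the `__class__` attribute instead (my substitution, not the turtle chain itself): every object leads to `type`, and the class of `type` is `type` again.

```python
# Follow __class__ links until an object repeats (illustrative sketch).
obj = 3.14
chain = []          # keeps every visited object alive, so ids stay unique
seen_ids = set()
while id(obj) not in seen_ids:
    seen_ids.add(id(obj))
    chain.append(obj)
    obj = obj.__class__

for step in chain:
    print(repr(step))
print("cycle back to:", repr(obj))
```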


It's not a clean, strict hierarchy of objects.

However, the basic idea is sound when you only want to obtain some limited,
known information, such as e.g. short descriptions of all string methods, or a
listing of the standard exception hierarchy.


<code language="Py3">
for attribute_name in dir( str ):
    if attribute_name.startswith( "_" ):
        pass
    else:
        attribute = getattr( str, attribute_name )
        doc_string = attribute.__doc__
        doc_lines = doc_string.splitlines()
        if len( doc_lines ) > 2:
            essential_doc = doc_lines[2]
        else:
            essential_doc = doc_lines[0]
        print( attribute_name.ljust( 15 ) + essential_doc )
</code>
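
On a stripped-down interpreter some objects may have no docstring at all, in which case `__doc__` is None and splitlines() fails. A more defensive variant might look like this; `first_doc_line` is a hypothetical helper name, and picking the first non-empty line is my assumption about what counts as the "essential" doc.

```python
def first_doc_line(obj):
    """Return the first non-empty docstring line of obj, or '' if there is none."""
    doc = getattr(obj, "__doc__", None)
    if not isinstance(doc, str):
        return ""                      # no docstring (e.g. stripped build)
    for line in doc.splitlines():
        if line.strip():
            return line.strip()
    return ""

for name in dir(str):
    if not name.startswith("_"):
        print(name.ljust(15) + first_doc_line(getattr(str, name)))
```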


<code language="Py3">
"Lists the standard exception class hierarchy with short descriptions."
import builtins
import inspect

indentation = "." + 2*" "

def is_type( o ):
    # Could use inspect.isclass for this, but in the DIY spirit:
    return isinstance( o, type )

def beginning_of( s, max_chars ):
    return s[:max_chars]    # Not yet discussed, but doesn't matter.

def print_hierarchy( h, level ):
    for o in h:
        if isinstance( o, tuple ):
            # o is a tuple describing a class
            cls = o[0]
            doc_lines = cls.__doc__.splitlines()
            short_doc = beginning_of( doc_lines[0], 55 )
            print( "{:<34} {}".format(
                level*indentation + cls.__name__, short_doc
                ) )
        else:
            # o is a list array of subclasses
            print_hierarchy( o, level + 1 )

classes = []
for name in dir( builtins ):
    o = getattr( builtins, name )
    if is_type( o ):
        if issubclass( o, BaseException ):
            classes.append( o )

hierarchy = inspect.getclasstree( classes )
# 'hierarchy' is a list array of tuples and nested list arrays of the same form.
# The top level is an array of two items, the first item a tuple describing the
# 'object' class, and the second item a list array representing the BaseException
# hierarchy.
print_hierarchy( hierarchy[1], level = 0 )
</code>
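
If the inspect module is missing from a stripped-down build, roughly the same tree can be produced with the `__subclasses__()` method that every class has. This is only a sketch of an alternative, not the approach above: `walk_subclasses` is a hypothetical name, and unlike the builtins scan it lists every exception class currently loaded, not just the built-in ones.

```python
def walk_subclasses(cls, level=0, lines=None):
    """Collect indented name lines for cls and all of its subclasses."""
    if lines is None:
        lines = []
    lines.append("  " * level + cls.__name__)
    # Sort by name so the output order is stable between runs.
    for sub in sorted(cls.__subclasses__(), key=lambda c: c.__name__):
        walk_subclasses(sub, level + 1, lines)
    return lines

tree = walk_subclasses(BaseException)
print("\n".join(tree))
```

A class that inherits from two exception classes shows up once under each parent. The recursion still ends, because inheritance cannot be circular.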


Cheers & hth.,

- Alf

Andrej Mitrovic

3/26/2010 2:17:00 PM



Thanks for all of that. And yes, I've noticed I get into infinite
recursions all the time, which is why I was asking if there was a
simple way to do this. I'll have a look at these later.

Kind regards,
Andrej Mitrovic

Dave Angel

3/27/2010 10:19:00 AM




I can't help with the original question, but generally the cure for
getting into infinite recursion when traversing a tree with multiple
connections is to keep a set() of all visited nodes. Whenever you hit a
node a second time, don't visit it or its dependents. It's not hard to
add to otherwise working code, and frequently the easiest place is right
at the beginning of the recursive function.
    if new_node in myset:
        return
    myset.add(new_node)
    ...process_this_node...
    for child in new_node:
        ...recurse...
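
Applied to dir(), that pattern could look roughly like the sketch below. Everything in it is an assumption for illustration: the function name `dump_attributes`, the choice to skip underscore names, and the `max_depth` cutoff. It tracks id() values rather than the objects themselves because many objects are not hashable, and it adds a depth limit because some attributes are created fresh on every lookup and would never be recognised as visited.

```python
import io

def dump_attributes(obj, name, out, seen=None, depth=0, max_depth=3):
    """Write an indented attribute tree for obj to the file-like object out."""
    if seen is None:
        seen = {}   # id -> object; holding the object keeps its id from being reused
    out.write("{}{}: {}\n".format("  " * depth, name, type(obj).__name__))
    if id(obj) in seen or depth >= max_depth:
        return      # already expanded elsewhere, or deep enough
    seen[id(obj)] = obj
    for attr_name in dir(obj):
        if attr_name.startswith("_"):
            continue            # skip private/dunder names to keep output readable
        try:
            child = getattr(obj, attr_name)
        except Exception:
            continue            # some attributes raise when accessed
        dump_attributes(child, attr_name, out, seen, depth + 1, max_depth)

buffer = io.StringIO()
dump_attributes(str, "str", buffer, max_depth=1)
print(buffer.getvalue())
```

To write to disk instead, pass an open file object in place of the StringIO buffer, and raise `max_depth` gradually since the output grows quickly.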

HTH
DaveA