Bartc
2/10/2016 4:55:00 PM
On 10/02/2016 15:24, David Brown wrote:
> On 10/02/16 15:07, BartC wrote:
>> From what I've been able to glean, PyPy is actually an interpreter (and
>> an even slower one than CPython), but written in such a way that
>> repeated execution paths (through the interpreter apparently, not the
>> bytecode being executed) are chosen to be JIT-compiled.
> PyPy is a Python interpreter with JIT acceleration. Where possible,
> Python code is used to generate C code that is then compiled (with gcc
> or clang), and called from the PyPy interpreter. Some Python code
> cannot sensibly be compiled in this way, and it gets interpreted.
I believe PyPy is written in RPython. That is, the interpreter for
normal Python is written in a special language called RPython, which
looks like Python but restricts those features that make it slow. That
RPython undergoes whole-program analysis to turn it into C.
So we still end up with a Python interpreter, but one written in C (a bit
like the normal CPython), with some extra magic in there to optimise
'hot' execution paths.
This is where it's unclear as to whether those paths are part of the
RPython interpreter, or part of the Python bytecode being executed.
I thought it was the former. But the end result is that the Python
program ends up faster (sometimes an order of magnitude or two faster) than
normal CPython. In which case, no one will really care exactly how it works!
Unless someone is discussing whether the Python is being compiled or
interpreted...
>> You can tell from the language anyway that it is completely unsuitable
>> for normal 'straight' compilation to native code. So interpretation
>> would be the usual start-point.
> Certainly interpretation is the common way to handle Python code. And
> static compilation is going to be a lot less efficient than dynamic or
> JIT compilation. For example, consider the function:
>
> def average(it):
>     return sum(it) / len(it)
>
> Your code could call that function with lists of integers, tuples of
> floating point, a tree structure of matrix types, or anything else that
> supports an iteration interface and whose elements can be added and then
> divided by an integer. Clearly any general compiled version of that
> function is going to involve a lot of indirect lookup to get the
> required functions to operate on the types that happen to be passed to
> any particular call of the function.
>
> But if your code only ever calls the function with tuples of three
> floating point numbers, it can be reduced to a few assembly instructions
> to operate only on that particular type of argument. So a JIT compiled
> version is going to be a great deal smaller and faster than a statically
> compiled version could ever be.
The JIT compiler needs to find that out first. PyPy for example is good
at optimising loops, so if average() is called a million times from a
loop, each time with the same type for 'it', then that becomes a 'hot path'.
(Yours is not a good example, however, because most of the work will be
done inside sum(). Since that is a built-in, it will be executing
native code N times, not bytecode, even if it still needs to do dynamic
type-dispatch for each element.)
--
Bartc