Does the compiler know best?

Ted Dziuba recently blogged about Python3’s Marketing Problem. I chimed in on the comment thread, but there was a deeper point that I felt was missed in the discussions about the GIL and PyPy and performance. Lately I’ve seen more and more people expressing sentiments along the lines of:

I’m of the same mind, but think that instead of offering a GIL fix, the goodie should have been switching over to PyPy. That would have sold even more people on it than GIL removal, I think.

I know it is an unpopular opinion, but somebody’s got to say it: PyPy is an even more drastic change to the Python language than Python3. It’s not even a silver bullet for performance. I believe that its core principles are, in fact, antithetical to the very things that have brought Python its current success. This is not to say that it’s not an interesting project. But I really, really feel that there needs to be a visible counter to the meme that “PyPy is the future of Python performance”.

What is the core precept of PyPy? It’s that “the compiler knows best”. Whether it’s JIT hotspot optimization, or using STM to manage concurrency, the application writer, in principle, should not have to be bothered with mundane details like how the computer actually executes instructions, or which instructions it’s executing, or how memory is accessed. The compiler knows best.

Conversely, one of the core strengths of Python has been that it talks to everybody, because its inner workings are so simple. Not only is it used heavily by folks of all stripes to integrate legacy libraries, but it’s also very popular as an embedded scripting system in a great number of applications. It is starting to dominate on the backend and the frontend in the computer graphics industry, and hedge funds are starting to converge on it as the best language to layer on top of their low-level finance libraries.

If you doubt that transparency is a major feature, you simply have to look at the amount of hand-wringing that JVM folks do about “being hit by the GC” to understand that there, but by the grace of Guido, go we. If we have to give up ease of embedding and interoperability, and visibility into what the running system is doing, for a little improvement in performance, then the cost is too steep.
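To make concrete the kind of transparency I mean (this is just an illustrative sketch, with an invented toy function): in CPython, the compiled bytecode and even the reference counts of live objects are ordinary, inspectable data, available from the language itself.

```python
import dis
import sys

def payroll_total(rates, hours):
    """Toy function whose execution we can inspect."""
    return sum(r * h for r, h in zip(rates, hours))

# CPython reifies its own internals: the compiled bytecode hangs off the
# function as an ordinary attribute, and dis renders it human-readably.
dis.dis(payroll_total)

# Reference counts are likewise directly observable at runtime.
x = object()
print(sys.getrefcount(x))  # at least 2: the name 'x' plus the call argument
```

Nothing here requires a debugger or a VM-specific tool; the running system simply answers questions about itself.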

It’s understandable that those who see Python as merely a runtime for some web app request handlers will have a singular fixation with “automagically” getting more performance (JIT) and concurrency (STM) from their runtime. I never thought I’d say this, but… for those things, just fucking use Node.js. Build a Python-to-JS cross compiler and use a runtime that was designed to be concurrent, sandboxed, lightweight, and has the full force of Google, Mozilla, Apple, and MSFT behind optimizing its performance across all hardware types. (It would not surprise me one bit if V8+NaCl finally became what the CLR/DLR could have been.) Armin and the PyPy team are incredibly talented, and I think Nick is probably right when he says that nobody has more insight and experience with optimizing Python execution than Armin.

But even Armin has essentially conceded that optimizing Python really requires optimization at a lower level, which is why PyPy is a meta-tracing JIT. However, PyPy has made the irreversible architectural decision that that level should be merely an opaque implementation detail; the compiler knows best.

An alternative view is that language runtimes should be layered, but always transparent.

Given the recent massive increase of commercial investment in LLVM, and the existence of tools in that ecosystem like DragonEgg, syntax really ceases to be a lock-in feature of a language. (Yes, I know that sounds counter-intuitive.) Instead, what matters more is a runtime’s ability to play nicely with others, and of course its stable of libraries which idiomatically use that runtime. Python could be that runtime. Its standard library could become the equivalent of a dynamic language libc.

Python gained popularity in its first decade because it was a non-write-only Perl, and it worked well with C. It exploded in popularity in its second decade because it was more portable than Java, and because the AMD-Intel rivalry led to spectacular improvements in CPU performance, so that an interpreted language was fast enough for most things. For Python to emerge from its third decade as the dynamic language of choice, its core developers and the wider developer community/family will have to make wise, pragmatic choices about what the core strengths of Python are, and what things are best left to others.

Viewed in this light, stressing Unicode over mere performance is a very justifiable decision that will yield far-reaching, long-term returns for the language. (FWIW, this is also why I keep trolling Guido about better DSL support in Python; “playing nicely with others” in a post-LLVM world means syntax interop, as well.)

The good news is that the Python core developers have been consistently great at making pragmatic choices. One new challenge is that the blogosphere/twittersphere has a logic unto itself, and can lead to very distracting, low signal-to-noise ratio firestorms over nothing. (fib(), anyone?) Will Python survive the noise- and gossip-mill of the modern software bazaar? Only time will tell…


10 thoughts on “Does the compiler know best?”

  1. While I agree with a lot of what you write here, the one point I’ll disagree with is the general tone of seeing CPython vs PyPy as an either/or decision for the overall Python ecosystem.

    The ecosystem as a whole benefits from *both* of them existing. With PyPy focusing on the “long running Python application” use case, that leaves CPython free to specialise in the original “system integration glue language” use case. And yes, there are plenty of things we can still do to improve on that front.

    PyPy’s existence also helps us clean up the distinctions between Python-the-language-specification and CPython-the-reference-implementation – ports to PyPy have revealed a few long-hidden bugs where CPython doesn’t actually abide by the language spec. Jython/IronPython were less effective at that task, since many discrepancies were chalked up to the underlying platforms. Since CPython and PyPy both use purpose-built Python VMs, they act as excellent cross-checks on the language specification.

    I’ve been asked a few times if I think PyPy will ever displace CPython as the reference interpreter, and my answer is always “No.” There are plenty of things about PyPy that are awesome, and its existence is a huge gain for the overall ecosystem, but the goals and communities are too different for it to ever make sense for the reference interpreter role to switch over.

    There sometimes seems to be a misguided desire for “one implementation to rule them all” – the ability to say “this interpreter is the right choice for every situation”. It’s simply never going to happen – the scope of software development is too diverse for that.

    We do need to reach a point where “Why aren’t you using PyPy?” can be answered with “CPython/IronPython/Jython is a better fit for my use case” without anyone getting huffy and indignant. However, suggesting that Python developers that happen to have use cases where PyPy *is* a good fit should become JavaScript developers instead is unhelpful and entirely unnecessary.

    This isn’t a zero-sum game where gains by one VM necessarily mean losses for another. The developers of the various implementations collaborate on the language specification and the standard library, and there’s nothing stopping users of the various implementations collaborating on shared interests, too.

    • Peter says:

      You’re right, Nick. There is plenty of space in the Python ecosystem for multiple implementations of the language, and I think it’s healthy. I just got tired of seeing a large number of people expressing this particular sentiment, without recognizing that there is deep value in what CPython provides.

      I’m pretty confident that we all have but imperfect and limited visibility into all the use cases of something as popular as Python, but I feel there is not enough advocacy for those who value the C-level integration that Python offers. Honestly I still don’t know how well we are stacking up against Lua in that department. It had tremendous momentum for a while (and of course is still massively deployed due to World of Warcraft), but according to TIOBE it’s dropped in popularity recently…

      • As far as I know, Lua is still an excellent choice as an embedded scripting engine for a larger application like the WoW client (as it’s relatively easy to sandbox), but I’m not aware of any significant use as a system integration or application development language.

  2. Sho says:

    Your paragraph on Node.js/V8/NaCl reads like dangerous half-knowledge to me, or at least is rather sloppy. Google, Mozilla, Apple, and MSFT do not in fact cooperate on a single JavaScript runtime. Mozilla and Opera have come out against NaCl. Further, NaCl is a sandboxing technology for native code; it sits below either CPython or PyPy in the execution environment.

    This paragraph also reads rather confused to me:

    “But even Armin has essentially conceded that optimizing Python really requires optimization at a lower level, which is why PyPy is a meta-tracing JIT. However, PyPy has made the irreversible architectural decision that that level should be merely an opaque implementation detail; the compiler knows best.”

    You seem to propose that “optimizing Python requires optimization at a lower level [than the language]” by making a point of Armin agreeing with that. Then why do you fault PyPy for making the low-level optimization strategy an implementation detail? That seems to be entirely in line with what you two are agreeing on: Low-level optimizations. I think what you’re actually trying to say is that performance requires active, manual optimization of minutiae by the programmer, but you’re not stating this clearly, you’re not providing any arguments to substantiate that claim, and I don’t think that’s actually something Armin would succeed on a statistical (i.e. average non-trivial project scope) level.

    There’s other stuff there, like insinuating but in no way substantiating that PyPy has a less intuitive GC behavior than CPython. Without substantiation that is just FUD, not a credible technical argument comparing the two interpreters.

    I have no doubt that you mean well, and you may in fact be well-informed and have interesting points to raise, but this post does not succeed at all in making them understood. Your paragraphs don’t seem to lead into each other well, and I can’t really confidently follow your line of reasoning. But I’d be interested in a cleaned-up v2.0.

    • Sho says:

      Whoops, s/succeed/concede/.

    • Peter says:

      “You seem to propose that “optimizing Python requires optimization at a lower level [than the language]” by making a point of Armin agreeing with that. Then why do you fault PyPy for making the low-level optimization strategy an implementation detail? That seems to be entirely in line with what you two are agreeing on: Low-level optimizations.”

      I think I answer this in the very next line of my original post: language runtimes should be layered and transparent. That is, one should be able to drive the higher layers from lower layers, not merely treat each successive lower layer as an opaque virtual box. This requires that the higher layers reify their runtime state and expose an API for lower levels to interoperate with. You can see this with CPython; it provides a C-API which external modules can use to create new objects and data structures in the higher-level runtime. *This is indescribably powerful.*
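      (A quick sketch of what I mean, runnable in stock CPython: because the runtime reifies its state and exposes a C-API, even `ctypes` can call the interpreter’s own C functions from pure Python – the interpreter is, quite literally, just another C library in the process.)

```python
import ctypes

# CPython's C-API symbols are visible in the running process; ctypes can
# invoke them directly, treating the interpreter as an ordinary C library.
api = ctypes.pythonapi

# PyLong_FromLong creates a Python int object at the C level...
api.PyLong_FromLong.argtypes = [ctypes.c_long]
api.PyLong_FromLong.restype = ctypes.py_object
n = api.PyLong_FromLong(42)

# ...and the result is an ordinary object in the high-level runtime.
print(n, type(n))  # 42 <class 'int'>
```

      Real extension modules do this from C, of course – the point is simply that the lower layer is addressable, not an opaque box.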

      This is also why I chose the title I did for my blog post. A compiler-centric view of the world sees the “low level” as merely components to be optimized over. It does not appreciate that no matter how smart your compiler, there may be users of your language who know more about their algorithms and data structures than you did when you wrote the compiler, and that they should be provided a mechanism for speaking with the lower level.

      People who actually live and breathe performance optimization know that at the end of the day, the goal is to stream as much data through as simplified a set of instructions as possible. By providing descriptors of data that can bubble up and down the layers of runtime complexity, you minimize redundant memory copies and maximize the opportunities for algorithmic efficiency (which is usually captured at the highest levels) to execute over as large a swath of data as possible. Giving programmatic access to those descriptors is what defines and differentiates a runtime from merely an optimizing compiler.

      “I think what you’re actually trying to say is that performance requires active, manual optimization of minutiae by the programmer…”

      That is a false dichotomy. One does not have to trade the expressiveness or “high level”-ness of the code for performance. Tending to minutiae is only one possible way to optimize performance. Another way, which seems to be lost on most people who are not game programmers or High-Performance Computing programmers, is to structure your data and your memory access so that relatively simple programs can operate on them.
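      (A minimal, invented illustration of that idea in pure Python: the same reduction runs over both layouts, but the column-oriented one keeps homogeneous machine doubles contiguous in memory – exactly the shape that a vectorizing runtime or a C extension can stream through.)

```python
from array import array

# "Array of objects": each reading is a separate heap object with named fields.
readings = [{"sensor": i, "value": float(i)} for i in range(1000)]

# "Structure of arrays": the values column lives contiguously as C doubles.
values = array("d", (float(i) for i in range(1000)))

# Identical high-level code over both layouts; only the data layout differs.
total_objects = sum(r["value"] for r in readings)
total_columns = sum(values)
assert total_objects == total_columns
```

      The program stays just as expressive; the data, not the code, is what got optimized.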

      “Your paragraph on Node.js/V8/NaCl reads like dangerous half-knowledge to me, or at least is rather sloppy”

      Point taken – it is a bit sloppy. I did not mean to imply they were all cooperating on a single runtime. However, they are all competing to make the fastest, most robust runtime for a particular dynamic language – and that language is not RPython.

      The aside about NaCl is because I always wanted to see the DLR succeed as a concept. I think languages *should* mostly be syntactic sugar in support of particular idioms or paradigms, and innovation on optimization and construction of low-level routines should be reused as much as possible. However, given MSFT’s fixation with Windows, there was no chance that DLR was ever going to reach parity on Linux or Mac with its performance on Windows.

      NaCl is a sandboxing technology, but with Pepper, I believe one can use it as an interop mechanism for C implementations of languages which were never designed to be used with each other. My comment about V8+NaCl is that if someone slaps a “fast enough”/”extensible enough” dynamic runtime on top of an ABI abstraction system, then languages hosted on that runtime can call through to functions in extension modules for each other, without marshalling or incurring other IPC costs. This was the selling point for the DLR, *if* your language could be implemented in .Net. NaCl just requires a recompile of your C code.

      “There’s other stuff there, like insinuating but in no way substantiating that PyPy has a less intuitive GC behavior than CPython. Without substantiation that is just FUD.”

      No, I was talking about the JVM, not PyPy. And I *definitely* have heard more Java people worrying about when they’re going to get hit by the GC than Python people worrying about cycle detection cleanup.

      “Your paragraphs don’t seem to lead into each other well, and I can’t really confidently follow your line of reasoning. But I’d be interested in a cleaned-up v2.0.”

      I’m sure I will get plenty of feedback, which I will probably have to roll into a v2. (Honestly, I sent this around to some friends for review and one of them tweeted it before I really meant for it to circulate more widely. Lesson learned! :-)

      • Romain says:

        “However, they are all competing to make the fastest, most robust runtime for a particular dynamic language – and that language is not RPython.”

        PyPy is a runtime for Python code, not RPython code; RPython is the equivalent of C++ in V8’s case.

  3. Luis says:

    Your suggestion of cross compiling Python to JS is not realistic. Others have tried, but the result is unsatisfactory performance-wise.
    And although you can implement 90% of the language relatively easily, “the other 90%” would be painful to say the least (see Skulpt).
    What can be done instead is creating an alternative syntax for JS, inspired by Python.
    It’s been done already. It’s CoffeeScript.
    It takes a lot from Python (and Ruby) but it’s not Python. It’s JS in disguise.

    Pouring syntactic sugar over js is not difficult.
    Implementing a whole different language with different data types and object system is another story.

    So ditching PyPy with your argument is not fair.
    PyPy is already competitive against V8, even though it is not supported by a large company.
    It won’t run in the browser for political reasons, but it can take its place on the server side.
    Actually, PyPy on the server and CoffeeScript in the browser sounds good to me.

    For other environments and uses, Python, IronPython, Jython or whatever Python is fine.

    So, what was the problem?

    • Peter says:

      “So, what was the problem?”

      No problem. I have always maintained that PyPy is an interesting project and a great thing to have in the Python ecosystem. Although a facile read of my blog and comment history on the subject might suggest otherwise, what I see as a problem – and the reason why I wrote this blog post – was because I keep seeing people talk about PyPy as a replacement for existing technology, and for all the wrong reasons. Last November when they first raised funds for the “Numpy on PyPy” stuff, and claimed to be able to reproduce the core of Numpy in 3 months, I wrote some posts about what they were missing about what Numpy *actually* is.

      Likewise, I have seen people repeatedly talk about PyPy as if it could be a replacement for CPython as the “primary” implementation, and the entire point of my post is that (1) there is a huge feature of CPython that that contingent is missing, and (2) if they are really so desperate for performance from a dynamic language, there are many viable routes that don’t involve throwing out a whole gaggle of babies with the bathwater.

      Having multiple language implementations is definitely a strength. There are some cool things being worked on at PyPy which I think can have positive impacts on the whole ecosystem. But to my knowledge, there was never nearly as loud a clamor to replace CPython with Jython or IronPython as there has been recently with PyPy. In my analysis, a large part of this is specifically due to the performance needs of people who write web application servers, and who want to use Python as a low-latency, highly parallelizable glue language between web services and distributed persistence engines. Hence, my Node.js comment. (And I honestly am no huge fan of Node, but I find its rapid rise in adoption a curious thing…)

  4. […] This is a response to Alex Gaynor’s response to my previous post, “Does the Compiler Know Best?” […]
