In a similar vein, see this page about the performance of the interpreter for the dynamic language Wren: https://wren.io/performance.html
Unlike the Zef article, which describes implementation techniques, the Wren page also shows ways in which language design can contribute to performance.
In particular, Wren gives up dynamic object shapes, which enables copy-down inheritance and substantially simplifies (and hence accelerates) method lookup. Personally I think that’s a good trade-off - how often have you really needed to add a method to a class after construction?
That’s basically what is done all the time in languages where monkey patching is accepted as idiomatic, notably Ruby. Ruby is not known for its speed-first mindset though.
On the other side, having a type holding a closed set of applicable functions is somehow questioning.
There are languages out there that allows to define arbitrary functions and then use them as a methods with dot notation on any variable matching the type of the first argument, including Nim (with macros), Scala (with implicit classes and type classes), Kotlin (with extension functions) and Rust (with traits).
It is getting better, now that they finally got the Smalltalk lessons from 1984.
"Efficient implementation of the smalltalk-80 system"
Yes, language design is a hugely important determinant of interpreter or JIT speed. There are many highly optimised VMs for dynamic languages but LuaJIT is king because Lua is such a small and suitable language, and although it does have a couple difficult to optimise features, they are few enough that you can expend the effort. It's nothing like Python. It's not much of an exaggeration to say Python is designed to minimise the possibility of a fast JIT, with compounding layers of dynamism. After years of work, the CPython 3.15 JIT finally managed ~5% faster than the stock interpreter on x86_64.
CPython current state is more a reflection of resources spent, than what is possible.
See experience with Smalltalk and Self, where everything is dynamic dispatch, everything is an object, in a live image that can be monkey patched at any given second.
PyPy and GraalPy, and the oldie IronPython, are much better experiences than where CPython currently stands on.
Python is worse, but not by all that much. After all, PyPy has been several times faster for many years.