So my latest thought is the sempiternal “hey, can’t we just inline everything?”
Some justifications:
Some of the most expensive calls Ruby makes are its own dispatch functions:
the “three most expensive methods were rb_eval, rb_call0, and rb_call” [1]. They are somewhat expensive in themselves, since each one is a long function in the C code.
Also note this:
@a = 0
def go
  @a += 1
end
>> Benchmark.measure { 300000.times { go } }
=> #<Benchmark::Tms:0x3435138 @real=0.64>
>> Benchmark.measure { 300000.times { @a += 1 } }
=> #<Benchmark::Tms:0x3431580 @real=0.40>
Anyway, if you can inline a method then it’s faster (0.40s versus 0.64s above), presumably because each call otherwise has to go through those dispatch functions.
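To re-run the comparison outside irb, something like the script below should do. It is only a sketch: the helper names count_with_call and count_inlined are made up for this example, and the timings will differ by machine and Ruby version, so no numbers are claimed here.

```ruby
require 'benchmark'

# The method under test, same as in the transcript above.
def go
  @a += 1
end

# Loop that pays for a method call on every iteration.
# (count_with_call is a hypothetical helper name.)
def count_with_call(n)
  @a = 0
  n.times { go }
  @a
end

# Same loop with the body of `go` pasted in by hand.
# (count_inlined is a hypothetical helper name.)
def count_inlined(n)
  @a = 0
  n.times { @a += 1 }
  @a
end

n = 300_000
puts format('method call: %.3fs', Benchmark.realtime { count_with_call(n) })
puts format('inlined:     %.3fs', Benchmark.realtime { count_inlined(n) })
```

Both helpers return the final counter, so it is easy to check that the inlined version still does the same work.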
It may even be possible to speed up other constructs, like passed blocks: just turn everything into one huge method with lots of local variables.
Blocks are slow-ish:
def yieler; yield; end
>> Benchmark.measure { 300_000.times { yieler { @a += 1 } } }
=> #<Benchmark::Tms:0x342dad0 @real=1.5>
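Here is roughly what “one huge method” could look like for a block, done by hand. The helper each_pair_sum and both callers are invented for illustration; an automatic inliner would have to produce something like total_inlined on its own.

```ruby
# Hypothetical helper: yields the sum of each pair of neighbours.
def each_pair_sum(list)
  (0...list.size - 1).each { |i| yield list[i] + list[i + 1] }
end

# Original style: the caller passes a block, so every pair costs a yield.
def total_with_block(list)
  total = 0
  each_pair_sum(list) { |s| total += s }
  total
end

# Hand-inlined: helper body and block body merged into one method,
# with the block parameter `s` turned into a plain local variable.
def total_inlined(list)
  total = 0
  i = 0
  while i < list.size - 1
    s = list[i] + list[i + 1]
    total += s
    i += 1
  end
  total
end
```

The merge is only safe when the block does nothing fancy; a block that uses break, next, or return would need more careful rewriting.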
You might also be able to optimize away unnecessary named blocks [see a recent example in Rails [2]].
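A small sketch of what that rewrite might be, assuming the case in question is a method that names its block (&block) only to call it. Naming the block makes Ruby build a Proc object for it, while a plain yield does not need one. The method names here are hypothetical.

```ruby
# Names the block, so Ruby has to turn it into a Proc object.
def twice_named(&block)
  block.call(1)
  block.call(2)
end

# Same behaviour without naming the block: plain yield.
def twice_yield
  yield 1
  yield 2
end
```

The rewrite only works when the Proc is not stored or passed somewhere that really needs an object.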
notes:
[1] http://kfahlgren.com/blog/ (also notes some speedup from compiler parameters [heh]).
[2] http://blog.pluron.com/2008/02/rails-faster-as.html
Stefan Kaes inlined some rhtml templates with some good results: http://railsexpress.de/blog/articles/2006/08/15/rails-template-optimizer-beta-test [though it's mostly used for routes].
ludicrous does something in this area; I wonder what its relationship to this project would be. Could they be combined?
You might be able to specify the expected class of each input parameter when you ‘inline freeze’ a method; then you could also inline the calls made on those parameters. Not sure how much it would help, since you’d have to go through instance.instance_variable_get and instance.instance_variable_set to reach the other object’s instance variables, but it might still be a win speed-wise.
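To make the ‘inline freeze’ idea concrete, here is a hand-written sketch. Counter, bump, and bump_frozen_for_counter are all hypothetical names; the second method is what a frozen version specialised for Counter might look like, with a fallback when the guess about the class is wrong.

```ruby
class Counter
  def initialize
    @n = 0
  end

  def increment
    @n += 1
  end
end

# Original: one method call on the parameter per invocation.
def bump(counter)
  counter.increment
end

# Hypothetical "inline frozen" version, specialised for Counter.
# The body of Counter#increment is pasted in; because @n belongs to
# another object it has to be reached via instance_variable_get/set.
def bump_frozen_for_counter(counter)
  if counter.instance_of?(Counter)
    value = counter.instance_variable_get(:@n) + 1
    counter.instance_variable_set(:@n, value)
  else
    counter.increment # unexpected class: fall back to the normal call
  end
end
```

Note that this trades one call to increment for a class check plus two reflective calls, so whether it is faster would have to be measured.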
Inlining methods might have a chance to make them more compatible with ruby2c, too [or ludicrous].
UnifiedRuby does some re-writes. http://rewrite.rubyforge.org/ does, too.
This would help for 1.8.6, but what about ‘real’ Ruby, 1.9? What then? [Maybe ruby_parser plus a way to find where in the code each function was defined, or translating iseqs back to Ruby? Or maybe run the code in 1.8.6, have it create optimized methods for you, and then use them.]
What other bottlenecks then present themselves after this?
How far to unroll recursive calls? [Mostly to help the alioth benchmarks along.]
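As an example of unrolling a recursive call by one level, take a naive Fibonacci. This is a hand-made sketch (fib_unrolled is a hypothetical name), just to show what the inliner would have to generate and where the “how far?” question comes from.

```ruby
# Plain recursive Fibonacci.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# One level unrolled: each of the two recursive calls is replaced by a
# copy of the method body, so one call covers two levels of recursion.
def fib_unrolled(n)
  return n if n < 2
  a = n - 1
  b = n - 2
  left  = a < 2 ? a : fib_unrolled(a - 1) + fib_unrolled(a - 2)
  right = b < 2 ? b : fib_unrolled(b - 1) + fib_unrolled(b - 2)
  left + right
end
```

Each extra level doubles the number of pasted copies of the body here, so the code size grows quickly.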
It may be possible to translate to faster constructs, e.g. for ... in instead of .each {}, whichever is faster.
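A quick way to find out which is faster on a given Ruby is to time both forms on the same data. The sketch below uses made-up helper names (sum_each, sum_for) and makes no claim about which one wins.

```ruby
require 'benchmark'

# Sum using Array#each with a block.
def sum_each(list)
  total = 0
  list.each { |x| total += x }
  total
end

# Same sum written with for ... in.
def sum_for(list)
  total = 0
  for x in list
    total += x
  end
  total
end

data = (1..300_000).to_a
puts format('each: %.3fs', Benchmark.realtime { sum_each(data) })
puts format('for:  %.3fs', Benchmark.realtime { sum_for(data) })
```

Since both return the same total, a rewriter could swap one for the other in simple loops like this, as long as the loop variable is not needed afterwards.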
Note also someone said that compiling with arch flags helps. Huh?