
And to illustrate the point further: I have found that -Os often produces faster code than -O3. Not by a factor of 10, but still clearly measurable.


Getting better performance with -Os over -O is some kind of edge case and people shouldn't generally expect that. -Os has disastrous consequences for C++ algorithms because it refuses to inline functors, so while you may rightly believe that a C++ std::sort is slightly faster than a C qsort, due to superior opportunities to optimize the C++ code, with -Os you'll find that std::sort is an order of magnitude slower. Definitely pays to check the result with a full-scale benchmark.


Interesting claim. I had to try this, and with the Xcode version that I have installed, std::sort with a simple lambda as the comparison function gives about the same result with -O3 and -Os. I would not call this disastrous, but of course opinions differ. Interestingly, qsort is significantly faster with -O3 than with -Os, but of course nowhere near std::sort.


Results will vary for small vs. large programs. I've seen catastrophic space optimizations that out-of-lined very small methods like std::vector::at, because the call was one or two bytes smaller than the inlined body. Lambdas are inlined even with -Os because they don't have names or multiple callers and can't be made smaller by out-of-lining. A functor class, or any function with multiple call sites, could trigger the problems with -Os.


OK, I'm not continuing with this. Just note that I didn't write that -Os would always, or even usually, be faster. I could try to come up with an example where -O3 produces a huge loop preamble for a loop that's iterated once, or just generates enough cache misses to be slower overall, but I don't care enough.



