
If anyone is curious, one of the fastest multi-threaded queue implementations out there is the LMAX Disruptor.

https://github.com/LMAX-Exchange/disruptor

https://github.com/LMAX-Exchange/disruptor/wiki/Performance-...

I've started using a variant of this in my .NET Core projects and have found the performance to be astonishing.
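For anyone who hasn't looked at it, the core API is small. Here's a rough sketch against the upstream Java DSL (the .NET port mirrors it closely); the ValueEvent class and the values are placeholders for illustration, not code from my project:

    import com.lmax.disruptor.EventHandler;
    import com.lmax.disruptor.RingBuffer;
    import com.lmax.disruptor.dsl.Disruptor;
    import com.lmax.disruptor.util.DaemonThreadFactory;

    public class DisruptorSketch {
        // Events are pre-allocated once by the ring buffer and reused,
        // so steady-state publishing allocates nothing.
        static final class ValueEvent {
            long value;
        }

        public static void main(String[] args) {
            int bufferSize = 1024; // must be a power of two

            Disruptor<ValueEvent> disruptor = new Disruptor<>(
                    ValueEvent::new, bufferSize, DaemonThreadFactory.INSTANCE);

            // Consumer: runs on its own thread, sees events in sequence order.
            disruptor.handleEventsWith(
                    (EventHandler<ValueEvent>) (event, sequence, endOfBatch) ->
                            System.out.println("got " + event.value));

            disruptor.start();
            RingBuffer<ValueEvent> ringBuffer = disruptor.getRingBuffer();

            // Producer: claim a slot, fill the pre-allocated event, publish.
            for (long i = 0; i < 10; i++) {
                ringBuffer.publishEvent((event, seq, v) -> event.value = v, i);
            }

            disruptor.shutdown();
        }
    }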



Jonathan Blow warns against threaded queues in game development, as normally simulating your world isn't the bottleneck (rendering is) and they will just cause a fair bit of unexpected behavior and debugging.


statements like this really need to be put into context. Maybe that's true for his games, but it's not necessarily true for all games. His latest game, The Witness, is a first-person puzzle game with zero non-player characters, no kinematics, and a variety of puzzles based around light and rendering. He designs games that don't have much to simulate and do have complicated rendering situations.

Meanwhile, Doom Eternal has no main or render thread at all, and instead uses a massively parallel jobs system. https://www.dsogaming.com/news/doom-eternal-does-not-have-a-...


I think the majority of game devs who would get advice from lectures/hacker news comments are probably making games on a small enough scope/scale that choosing a single threaded game logic engine is fairly reasonable. The people at id Software are the best of the best; this kind of reminds me of the fitness "advice" I was once given to not go on long jogs/runs because "all the best marathon runners are super skinny", but that only applies to world class marathoners, not dudes running a 5k on the weekend.


people who work in games also read HN you know.


And said AAA devs aren't "ubermensch". They run a large gamut of specialties and skillsets - plenty of them will benefit from extra context.

And some of those small scale hobby/indie devs may later end up working for "the big leagues" as well, so the extra context can benefit them too.


tbf I said take advice from HN. I expect people deeply involved in games to have their own specialized info us mortals can't touch (or they just do their own testing)


I'd appreciate a link so I could hear his whole argument.

You can argue about when it is appropriate to use threads at all. But, if I'm going to use threads, I use a threaded queue for communication exclusively.
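The pattern is simple either way: the queue is the only thing the two threads share. A minimal sketch in plain Java with made-up names, not code from any real project:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class QueueComms {
        public static void main(String[] args) throws InterruptedException {
            // The queue is the only shared state between the two threads.
            BlockingQueue<String> commands = new ArrayBlockingQueue<>(256);

            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String cmd = commands.take();   // blocks until a message arrives
                        if (cmd.equals("quit")) break;
                        System.out.println("worker handling: " + cmd);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();

            commands.put("load-level");   // blocks if the queue is full (backpressure)
            commands.put("spawn-enemy");
            commands.put("quit");
            worker.join();
        }
    }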


I wish X4 would go this route though, as it is entirely bottlenecked by simulation speed.


I imagine execution order/consistency is very valuable for 4X games, and most of the time, results are dependent on each other (who wins a battle may depend on the current status of an empire, which is dependent on the outcome of various planet-level actions, for example). It'd probably be a very different game if each action were stateless, though it could be a cool exercise.


Note that despite using the same two characters, X4 is definitely not a 4X game.


Whoops, my brain just ran right through the word. Looking at X4, it still looks like an incredibly busy game with a very busy gamestate


In some cases you have to. I'm working on a game and absolutely need separate threads for animation, rendering, and constructing meshes (it's kind of procedural - it's partly a 3D map renderer).


Sim games (eg SimCity, Sims) could spend more time on sim than graphics.


Indeed, a lot of games are actually single threaded.


In recent years there has been a trend shift away from this, at least in the AAA engines, towards a job system. This makes sense: you have a thread per core and you create jobs to “go wide” when you can. See for example Unity’s Job System or the GDC talk by Naughty Dog from a few years ago.

The big games will also prepare data for rendering in parallel (eg culling and sorting and whatnot, although much of this is also done on the GPU).

(Going by GDC talks, the rendering teardown articles and just what I see online from Unity/Unreal. I don’t work in games myself)
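To make the "go wide" idea concrete, here's a very rough sketch with a plain Java thread pool - not how Unity or Naughty Dog actually implement their schedulers, just the shape of it: one worker per core, a frame stage is split into jobs, and the main loop joins on them before moving to the next stage.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class JobSystemSketch {
        public static void main(String[] args) throws Exception {
            // One worker thread per core; jobs are submitted to "go wide" within a frame.
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService workers = Executors.newFixedThreadPool(cores);

            float[] positions = new float[10_000];

            // Split one update stage across jobs, then join before the next stage runs.
            int chunk = positions.length / cores;
            List<Future<?>> frameJobs = new ArrayList<>();
            for (int j = 0; j < cores; j++) {
                final int start = j * chunk;
                final int end = (j == cores - 1) ? positions.length : start + chunk;
                frameJobs.add(workers.submit(() -> {
                    for (int i = start; i < end; i++) {
                        positions[i] += 0.016f; // stand-in for per-entity update work
                    }
                }));
            }
            for (Future<?> f : frameJobs) {
                f.get(); // join point: every job in this stage has finished
            }

            workers.shutdown();
        }
    }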


Which makes total sense because single thread performance is growing more slowly these days. Used to be you'd double every couple of years but today the midrange is only about 50% faster single threaded than it was in '16. Now if you count all the cores you're still seeing things more than double over that time. Compare these similarly priced CPUs from today and a few years ago: i5-6500 and i5-10500. The latter is maybe 30-40% faster single threaded but has more than double the parallel throughput.


It's true, but a lot of the main areas don't multi-thread well, like AI and physics.


I don't know why AI shouldn't thread well, assuming there is more than one actor. As long as they are operating over an immutable view of the game state, each actor should be able to plan independently and enter its commands independently. Likewise, there are probably some tricks you can do with physics. And anyway in most games interactive physics is only done for a few objects in the game world, and those objects are often not interacting with each other, at least not physically. You could cluster the objects that can affect each other and then do each of them single-threaded.
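Roughly what I mean, as a sketch in Java (the types are made up for illustration): every actor plans against the same read-only snapshot, so planning parallelises without locks, and the commands get applied afterwards on one thread.

    import java.util.List;

    public class ParallelAiSketch {
        // Hypothetical types: an immutable view of the world and a queued command.
        record WorldSnapshot(long tick, List<double[]> actorPositions) {}
        record Command(int actorId, String action) {}

        interface Actor {
            Command plan(WorldSnapshot world); // reads the snapshot, never mutates it
        }

        // Every actor plans against the same read-only snapshot, so planning
        // parallelises freely; the resulting commands are applied later, single-threaded.
        static List<Command> planAll(List<Actor> actors, WorldSnapshot snapshot) {
            return actors.parallelStream()
                         .map(actor -> actor.plan(snapshot))
                         .toList();
        }
    }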


> As long as they are operating over an immutable view of the game state

That's a big issue. There is a surprising amount of back and forth between objects in a single step of gameplay/AI.

And, generally gameplay code tends to be a big mess of wild and ever-changing requirements from gameplay designers, extreme time crunch and short term (1 game then burn it) goals. Ivory tower software architecture it is not...

Clustering physics into "islands" is common practice though.
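The island idea, sketched with a plain union-find (not how any particular engine structures it): bodies that share a contact end up in the same island, and each island can then be solved on its own thread because islands don't interact with each other.

    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class IslandSketch {
        // Union-find root lookup with path halving.
        static int find(int[] parent, int i) {
            while (parent[i] != i) { parent[i] = parent[parent[i]]; i = parent[i]; }
            return i;
        }

        // contacts[k] = {bodyA, bodyB}; returns islands as lists of body indices.
        // Each island only interacts internally, so islands can be solved in parallel.
        static Collection<List<Integer>> islands(int bodyCount, int[][] contacts) {
            int[] parent = new int[bodyCount];
            for (int i = 0; i < bodyCount; i++) parent[i] = i;
            for (int[] c : contacts) parent[find(parent, c[0])] = find(parent, c[1]);

            Map<Integer, List<Integer>> groups = new HashMap<>();
            for (int i = 0; i < bodyCount; i++)
                groups.computeIfAbsent(find(parent, i), k -> new ArrayList<>()).add(i);
            return groups.values();
        }
    }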


I'm not a game dev but I do know how software can become a mess. I think engines that are used by many games have a chance to push good practices here.


I mean, not entirely, though the game logic often will be. But usually there is a fair bit of threading going on, and things are threaded in the engine or the graphics card driver.


Would be nice to see updated benchmarks against the C++ queues - this LMAX queue seems to give 20-25 million messages per second on Sandy Bridge - the best 1P/1C C++ queue is around 250 million messages per second on a 9900K, and I doubt a 9900K is 10 times more performant than a 2600K.

> https://max0x7ba.github.io/atomic_queue/html/benchmarks.html


I am more concerned about worst case latency than chasing message rates in the hundreds of millions per second. The load those events will generate far exceeds the load incurred by creating and processing them on the same physical host, so I'll never be in a situation where 25 vs 250 million makes a difference.

I am also interested in the productivity and safety afforded by high-level languages in this arena. Dealing with memory and threading at the same time is not something I like to do at a low-level.


LMAX is optimizing latency, not throughput.


Can you give more context about your projects i.e. what makes them require a super high-performance queue?


The type of project I am using this for is a centralized client/server UI architecture where 100% of user events are submitted to the queue for processing. This allows for very high throughput user interfaces if you are doing clever things on the server WRT caching of prior-generated content for other events (i.e. all login attempts for the same region will get the same final view).

I found the abstraction this was originally developed for - processing of financial transactions with latency as the primary constraint - to be an excellent analogue for UI event processing. Latency is also a huge concern when the user's eyeballs are in the loop.


And that uses Java...


If you control object pools yourself and don’t use GC, as the LMAX Disruptor does as far as I remember, Java can be blazingly fast.
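The trick is that nothing on the hot path allocates, so the collector has nothing to do. A toy sketch of the idea (single-threaded, made-up Order type; the Disruptor gets the same effect by pre-filling its ring buffer with events):

    import java.util.ArrayDeque;

    public class PoolSketch {
        // A mutable message object that gets reused instead of reallocated.
        static final class Order {
            long id;
            double price;
            void clear() { id = 0; price = 0; }
        }

        // A tiny single-threaded pool: everything is allocated up front,
        // so the hot path does no allocation and triggers no GC.
        static final class OrderPool {
            private final ArrayDeque<Order> free = new ArrayDeque<>();

            OrderPool(int size) {
                for (int i = 0; i < size; i++) free.push(new Order());
            }

            Order acquire() { return free.pop(); }  // throws if exhausted; real pools handle that
            void release(Order o) { o.clear(); free.push(o); }
        }
    }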


Martin Thompson has basically made a career out of writing Java in the style of embedded C because he found enterprise customers that need the performance of embedded C but, being enterprise, insist on absolutely everything being Java.


calling external code from Java adds latency.


You have to make sure all dependencies don't use GC as well right?


Sure, but I'm assuming that if you're writing such high performance limited scope software like the LMAX disruptor, you have few dependencies (looking at their code, it appears that the disruptor code itself has no external dependencies and uses few of the standard library classes outside of NIO bytebuffers).


in LMAX-disruptor's case, they have no runtime dependencies: https://github.com/LMAX-Exchange/disruptor/blob/master/build...


Are you using https://github.com/disruptor-net/Disruptor-net library port or something else?


This is exactly what I am using.


Just curious, are you also making use of Span and Pipelines?


I haven't made much use of Span directly, but I do like using Pipelines for copying streams to other streams (i.e. building AspNetCore proxy abstractions).


Thanks!



