I cannot claim to be very experienced, but it seems to me that the simulation could be given last-agreed-upon (LAU) player positions, and firing states, including the local player. The local player would see their screen move fluently, but the underlying simulation would only "see" the player in last-agreed-upon position. That would keep all of the monster simulation in sync. However, this would cause local lag in weapon firing/door opening. But, that may be acceptable to some degree (200ms to see your weapon get fired, or a door opened).
The acceptability of that kind of lag depends upon your point of view. In coop, it might be OK for lifts to be lagged, but in a fast-paced 1v1, the disadvantage is significant enough to be completely unfair. Ditto for stuff like rockets/plasma/weapon pickups, etc.
FWIW, this is how Oda/ST/ZD work right now. The server sends "LAU" positions for everything once in a while, and then the client re-predicts its local position from there (Server: Your position at TIC 1240 was x/y/z; Client: OK, I am at TIC 1245 so set my position to x/y/z and run 5 more commands until I catch back up).
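To make that rewind-and-replay step concrete, here's a rough sketch. It is not the actual Oda/ST/ZD code; `Command`, `apply_command` and `repredict` are names I made up for illustration, and movement is reduced to adding an offset per TIC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    # Hypothetical, heavily simplified input: map units moved this TIC.
    forward: int
    side: int


def apply_command(pos, cmd):
    """Advance a position by one TIC's worth of input."""
    x, y = pos
    return (x + cmd.forward, y + cmd.side)


def repredict(server_tic, server_pos, client_tic, history):
    """Snap to the server's position, then replay the commands it hasn't seen.

    history maps TIC -> the Command the client ran locally at that TIC.
    """
    pos = server_pos
    for tic in range(server_tic + 1, client_tic + 1):
        pos = apply_command(pos, history[tic])
    return pos


# Server: "your position at TIC 1240 was (100, 50)". Client is at TIC 1245.
history = {tic: Command(forward=2, side=0) for tic in range(1236, 1246)}
print(repredict(1240, (100, 50), 1245, history))  # (110, 50)
```

The client has to keep its recent commands around (the `history` dict here), otherwise it has nothing to replay after the snap.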
The player would "feel" like they were moving fluently, and the screen would render fluent movement, but the underlying simulation would see the player move in chunks. Player movement prediction would show a fluid player on other clients, but their simulation would use the agreed-upon position.
This is kind of how Oda/ST/ZD work as well. Clients send their commands to the server, but they don't necessarily arrive at the proper timing; i.e., the client may send commands 1241, 1242, 1243 and 1244 about 28.6ms apart (one TIC at 35Hz), but the server may receive them all at TIC 1246. You have a choice at that point: buffer the commands and run 1 per TIC, run them all, or run them at some other rate (2 per TIC or whatever).
The decision you make doesn't make a huge difference to the sending client because of clientside prediction, but if you run 4 commands for a player, every other player is going to see that player move 4x as fast... unless you implement some kind of smoothing... which is by definition inaccurate.
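Here's a toy version of those three choices, just to show what each one does to the schedule. `run_buffered` is a made-up helper, not anything from a real port; it only decides which commands run on which server TIC.

```python
from collections import deque


def run_buffered(commands, per_tic, start_tic):
    """Return {server_tic: [commands run that TIC]} for a given drain rate."""
    queue = deque(commands)
    schedule = {}
    tic = start_tic
    while queue:
        count = min(per_tic, len(queue))
        schedule[tic] = [queue.popleft() for _ in range(count)]
        tic += 1
    return schedule


late = [1241, 1242, 1243, 1244]  # all four arrive at server TIC 1246

print(run_buffered(late, 1, 1246))  # one per TIC: done at TIC 1249
print(run_buffered(late, 4, 1246))  # all at once: 4 moves in a single TIC
print(run_buffered(late, 2, 1246))  # compromise: 2 per TIC, done at TIC 1247
```

Running one per TIC keeps the player's speed normal for everyone else but leaves the server three TICs behind that player's input; running all four at once catches up immediately but is exactly the 4x-speed jump described above.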
It makes no difference if a monster sees you at tic 1200, or at tic 1204.
Ohhhh but it does. For example, a lot of the simulation depends on the value of "leveltime"; if I run TIC 1241 with a different leveltime value than the server, desyncs are highly likely, and without something to reconcile those desyncs (you can get lucky with state/position updates... but at some point a "full" update will be required) the client will be in an irrecoverable state. There are very few things that don't impact the game state, which is why I'm skeptical of PVS filtering. Like Dr. Sean said, it's most useful for coop, but I would have to think pretty hard about it... which I haven't done.
I think it is vital for the simulations to stay in sync at all costs, especially for coop 100+ monster slugfests. You can fake the client view pretty well, but you can't run the simulation backwards, unless you store this massive history. I don't think you have to.
You don't necessarily have to, so long as you perfectly exempt the game simulation during clientside prediction and perfectly update the simulation based on server messages. In practice, this is very hard to ensure.
For example, let's say the server sends the message that an Imp fired a fireball and a Sergeant moved into its path at TIC 1242.
First of all, it's not a given that clients & servers receive messages, because UDP doesn't guarantee delivery. I can make those messages "reliable" by waiting for the client/server to acknowledge receipt, and resending those messages if I don't get that... but how long do I wait? Your average ping (round-trip) in North America is something like 60-70ms, which is over 2 TICs, so in practice you wait 3. So if the client doesn't ACK the fireball message by TIC 1245, I have to resend it, and then wait until TIC 1248, and so on. You can see how latency can pile up, especially for connections that are already bad.
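The arithmetic behind those TIC numbers, as a quick sketch. It assumes Doom's 35 TICs per second and a resend timeout of exactly one round trip rounded up to whole TICs; `rtt_in_tics` and `resend_tics` are hypothetical helpers, and a real implementation would probably pad the timeout for jitter.

```python
import math

TICRATE = 35  # Doom runs 35 TICs per second, so one TIC is about 28.6ms


def rtt_in_tics(rtt_ms):
    """Round a round-trip time up to whole TICs."""
    return math.ceil(rtt_ms * TICRATE / 1000)


def resend_tics(sent_tic, rtt_ms, attempts):
    """TICs at which an unacknowledged message would be resent."""
    wait = rtt_in_tics(rtt_ms)
    return [sent_tic + wait * n for n in range(1, attempts + 1)]


print(rtt_in_tics(65))           # 3
print(resend_tics(1242, 65, 3))  # [1245, 1248, 1251]
```

Every dropped packet costs another full wait, so a message lost twice at a 65ms ping doesn't land until around TIC 1251, nine TICs after the event it describes.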
Back to our example. Let's say for some reason I drop the "Sergeant moved" message and the fireball continues to hurtle towards me. Desync! The question now is how the netcode resolves the problem. There are usually messages to damage/kill/remove actors and set their targets, so (unless I drop those too), this desync probably won't be too bad: I'll get a message that the Sergeant was damaged, that its target changes to the Imp (infighting!!!) and that the fireball has been removed from the game. But if I miss even one of those, the game simulation will start to spiral out of control. This desync resolution strategy is clearly not ideal.
You may argue that dropping messages is rare and netcode shouldn't be expected to handle something as ridiculous as data just disappearing into the Internet. The reality is that packets drop even on wired LANs, and netcode has to deal with it, or suffer the consequences of desyncs. Furthermore, Doom has become international, and players are often connecting to each other overseas or through cellular and satellite connections. Will their experience be degraded? Probably. But as a netcode developer, you ignore them at the peril of your port's popularity.
In most cases, you can hide latency:
Player movement: Fake local player view, use LAU mobj position for local and remote simulation.
Weapons: Each non-hitscan weapon has a spin-up time - could hide some latency there.
Doors: Some lag in door opening can be reasonable (< 200 ms ok?)
Hitscan weapons are the biggest issue, but, again, some lag is acceptable. Force hitscans to be agreed upon, and the simulation stays in sync. Regrettable to be affected by lag here, but gun flash could hide a small amount of latency.
Latency hiding is an interesting topic, and there is some stuff on the Internet about it. Doom can't really use it though, and I'll illustrate. When you fire the SSG, your local client (assuming it implements clientside prediction) starts going through the motions of firing: it moves your player sprite through the firing animation, makes the firing sound, maybe even spawns the bullet puffs or blood spots on its own (without waiting for the server). It then waits for the server for the reaction of anything hit by your pellets, damage calculation, line/monster activation, etc.
Latency hiding here would be the server sending everyone the message that you fired before you actually fire. Doom can't do this because the server also has to move your player through the firing frames. It can't just send everyone a message [Player 2 fired the SSG at TIC 1243] based on your command at TIC 1241, because you might move or die between now and then.
On a LAN, there's no reason for the simulation to ever go out of sync. Those issues should be corrected 100% before attempting to hide lag. And, I believe that, at reasonable latency, that can be hidden without having to correct clients.
Yeah a LAN is an ideal situation, but things can still go wrong. Old machines can experience CPU lag on complex maps, etc. In fairness, there's not a lot you can do about that.
What could be interesting is to implement both approaches: First, strive to keep all clients in sync at all times. But, if a desync does occur, use your method to maintain recent saves + applied deltas, to get the client up and running quickly. And, the save+delta approach allows new clients to join midgame, with a minimum distraction.
Well, clients have to know they've desync'd, which is pretty hard (Server: [Remove fireball]; Client: Shit, what fireball... OK now what...?). You also have to take latency into account. Assuming clients can detect a desync (which again, is very hard), they have to request a full update, and using deltas, they have to know the "last good sync point" so they can request an applicable delta, and then they have to wait for the server to send it, then acknowledge they received it, and before you know it, 10 TICs have elapsed and you're dead anyway.
Furthermore, deltas grow in size the longer you don't send one; i.e., the delta between TICs 1241 and 1242 might be 80 bytes, and the delta between TICs 1241 and 1243 might be 120 bytes. If you do the research (as crazy old Dr. Sean did), you find out that you can't just send shitloads of bytes whenever you want; even if your server has something like 100Mbit/s upload, routers will freak out at you and start dropping your packets on purpose (yes, this is how the Internet is actually designed to work, believe it or not). So you have to do some complicated stuff to ensure your delta sizes stay under a certain size and then "flush the buffer" as it were when they approach that size limit. At that point, you may as well just send the delta every TIC anyway.
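A bare-bones sketch of that "flush the buffer" bookkeeping, under some loud assumptions: `DeltaBuffer` is a made-up class, the 200-byte limit is picked only to make the example short, and it sums per-TIC sizes, which is pessimistic since a real multi-TIC delta can be smaller than the sum of its parts (120 bytes rather than 160 in the example above).

```python
class DeltaBuffer:
    """Accumulates per-TIC changes and flushes before exceeding a size limit."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes  # hypothetical budget, not a real MTU figure
        self.pending = []           # TICs waiting to be sent
        self.pending_bytes = 0

    def add(self, tic, size):
        """Queue one TIC's changes; return the TICs flushed to make room."""
        flushed = []
        if self.pending and self.pending_bytes + size > self.max_bytes:
            flushed = self.flush()
        self.pending.append(tic)
        self.pending_bytes += size
        return flushed

    def flush(self):
        """Hand back everything pending and reset the buffer."""
        tics, self.pending, self.pending_bytes = self.pending, [], 0
        return tics


buf = DeltaBuffer(max_bytes=200)
print(buf.add(1241, 80))  # []
print(buf.add(1242, 80))  # []
print(buf.add(1243, 80))  # [1241, 1242]
```

With per-TIC changes anywhere near the limit, the buffer ends up flushing almost every TIC, which is the point being made above.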
Wow, maybe I do know something about it?? Hell, it's an interesting thing to try anyway. I'll have to write some code to fake latency on my LAN to try it out!
Yeah! Be careful though, netcode can kind of be a rabbit hole. We have yet to really get into things like unlagged, player movement smoothing, clientside prediction, collision smoothing, that kind of thing. You may find out it's way more work than you bargained for ;) I know I feel that way sometimes haha.
Dr. Sean linked some good stuff; you can also google around for Yahn Bernier, the guy in charge of netcode for Valve games.
I know Dr. Sean has a lag tool he wrote, but I use Zalewa's baller Gamer's Proxy. I thought I might write one in Go, but why?
You've already got some of this working, right? Is it completed, or in an alpha stage?
I just merged D2K's netcode back into the master branch. Again, it's not at all ready for public consumption; it needs a lot more testing and I've undoubtedly broken some things, but I'll be around to brag once I get it presentable ;) I will say that it performs amazingly in testing, but I'm biased.
EE's C/S netcode is based on network messages. I tried to ameliorate the problems with that architecture in various ways, and I ended up implementing crappy deltas. At this point, the C/S branch is extremely outdated. My current plan is to get D2K's netcode production-ready and cleaned up, port it to the latest EE, and send Quasar a pull request on GitHub. There is some stuff in the C/S branch that is still useful (WAD downloading, master advertising, banlists, etc.), but the network message architecture is just "doomed" to failure (heh).
Actually, due to the way Doom works, the only packets that are really needed are the players', because their movements are the only truly unpredictable factor, but everything else (monsters, plats etc.) is 100% predictable if both the client and server use the exact same engine or they are 100% equipotent, and if the players' actions are followed correctly by all clients.
If you synchronize every TIC then yes, this is the case. That's how the original netcode worked. If you allow players to run the game sim, or their local commands, independently (asynchronously), then you enter this rat's nest.
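A toy illustration of why lockstep works and why it falls apart once peers stop running identical inputs. Nothing here is real Doom code: `RNG_TABLE` is a short stand-in for a fixed random table, and the "simulation" is one player and one monster on a line.

```python
# Stand-in for a fixed, shared random table (an assumption for this sketch).
RNG_TABLE = [0, 8, 109, 220, 222, 241, 149, 107]


def run_tic(state, player_cmd):
    """Advance one TIC. state = (leveltime, rng_index, player_x, monster_x)."""
    leveltime, rng_index, player_x, monster_x = state
    roll = RNG_TABLE[rng_index % len(RNG_TABLE)]
    monster_x += 1 if roll > 100 else -1  # monster "AI" is a table lookup
    return (leveltime + 1, rng_index + 1, player_x + player_cmd, monster_x)


def simulate(commands, state=(0, 0, 0, 0)):
    """Run one TIC per player command, starting from the given state."""
    for cmd in commands:
        state = run_tic(state, cmd)
    return state


commands = [1, 1, -1, 0]
server = simulate(commands)
client = simulate(commands)
print(server == client)  # True: same commands, same TICs, same state

late_client = simulate(commands[:-1])  # one command never arrived
print(server == late_client)  # False: leveltime and RNG index no longer match
```

Only the player commands cross the wire in the first case, and the monster stays in sync for free. The moment one peer runs a different number of TICs, everything driven by leveltime or the RNG index drifts with it.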
Last edited by Ladna on 07-09-14 at 15:54