sun_stealer

AI research and Doom multiplayer

Hi, Doom community!

 

I am a computer science Ph.D. student doing research in deep reinforcement learning. Some of you might know that Doom is a fairly popular platform amongst us thanks to https://github.com/mwydmuch/ViZDoom

Over the last few years ('16-'18), there has even been an AI competition where bots competed against each other in FFA deathmatch. Despite the best efforts of multiple teams, none of the participating agents was able to match a human player. This is what makes reinforcement learning research using Doom pretty exciting: if the best AI agents we can make lose to humans even in this relatively simple game, there must be something we're missing (whether it's temporal/spatial awareness, memory, or something else).

Just as a clarification: here I am talking about agents that use raw screen pixels as input and that are trained from experience using machine learning, not the hand-programmed scripted bots we all know and love.

 

I am interested in implementing state-of-the-art RL algorithms to create robust Doom agents and I want to benchmark them against good human players. In particular, I am interested in deathmatch and duel scenarios.

 

I am completely new to Doom multiplayer and I am asking the community to help me create a fair benchmark.

First of all, I want to see with my own eyes what Doom multiplayer looks like (is it still alive?) What are the most representative maps and Doom versions that people use for duels and deathmatch?

I figured the most popular port is Zandronum, but there's a lot of stuff I'm a bit lost on. I need something that everyone can agree is the "standard" Doom multiplayer experience, e.g. something like de_dust2 in CS or ctf_2fort in Team Fortress 2. Something that literally every Doom player is familiar with, something that is used in both tournaments and casual games.

 

On top of that, I am asking whether people would be interested in participating in a small tournament against AIs, which will provide the information for a research paper?

Edited by sun_stealer

Link to post

I would look into this if you're looking for the maps commonly played. Doom multiplayer is still alive, for sure. There are various Discords you can look at, such as MDF. In regards to ports, Zandronum and Odamex are both used.

 

If you are looking for the dust2 or q3dm6 of Doom I'd have to say it's either doom2.wad map01 or dwango5.wad map01.

 

Edit: I should also say that this is very interesting and I wish you luck. I've thoroughly enjoyed watching the OpenAI stuff and would love to see something similar applied to Doom.

Edited by xvertigox

Link to post

Hi,

 

There are a number of factors at play when picking what is "standard" (and there are many competing views on that). The easiest way would be to draw a line between oldschool (which has more clearly designated "standards" or flagsets) and newschool (a bit more nebulous, with a variety of approaches). The most important distinguishing points between these two would be the enabling of jump, freelook, and finite actor height.

 

Oldschool mostly tries to emulate vanilla play in more modern ports such as Zandronum, though the alternative choice would be Odamex. Odamex is arguably more suitable for oldschool play, and the maps that have been used in tournaments (most recently QuakeCon, I believe) would be SSL2, Dwango5 Map01, Judas23_, Doom2 Map01, and I think King1. Might've been Dwango5 Map07, which is another viable standard map everybody knows. Those are for 1v1.

 

For FFA deathmatch, this question has actually been asked before on the ZDoom forums, here: https://forum.zdoom.org/viewtopic.php?f=4&t=56552 The answers there are mostly unchanged; the only difference is an additional newschool set called NeonDM, made by the same people who made DBAB and Aeon.

Port choice is player preference; neither is "standard", but it's worth keeping in mind that there are nuanced differences between them that change how the game feels for players (Odamex is based on a much older version of ZDoom and CSDoom, while Zandronum is based on a much newer version of ZDoom). Zandronum supports more modern features which change map dynamics (such as 3D floors, which allow room-over-room creation), while Odamex more definitely caters to old school.

 

Newschool has fewer fixed rules, but there are many people willing to help you out, starting with the MDF, yeah. As for finding players, MDF and WDL are good places to look. I think Devastation is still around, Ralphis is still here, and @dew can probably give good suggestions for viable players who aren't too immature for the research process.

Link to post

@xvertigox @Decay thank you so much for your replies!

I see now, the difference between "oldschool" and "newschool" is very substantial indeed. The existence of freelook and jumping mechanics turns Doom into a completely different game! I'd be interested in using something that people play today, but this might be difficult if ViZDoom does not support it. I will check.

 

Meanwhile, does anyone have footage from recent Doom tournaments (duel, FFA)?

This is the best I could find. Is it representative of Doom multiplayer in general?

 

Link to post

In my entirely personal opinion old school/classic Doom is the game to target. Once you add vertical mouselook, new weapons etc it changes the scope of the project enough that you'd be making an idtech1 agent which can encompass so many different weapons and map types.


The footage you showed is on point. If you'd like to see more recent footage I've linked a video below. You could also download Doomseeker and spectate matches yourself but it may be hard to find the specific kind of gameplay that you're after. There will be copious amounts of demos around that others could provide.

 

Spoiler

 

 

Link to post

^ That's Dwango5 map01

 

I think this sounds really awesome and I for one would be happy to train our new robot overlords to kick our butts, or at least play a match or three :D

 

I think I also have to second what xvertigox said that you should prolly focus more on old school settings/maps to start things out, as the bots may not benefit from the added complexity of maps made in a new school style at the outset. I also want to note that Decay's reply is a great one.

 

Doom's mp is still alive but depending on time of day and day of the week you may or may not see matches going on.

 

If you plan on using Odamex then I'd recommend Doom Explorer; not sure if Doomseeker works with Odamex, as that's a Zandronum thingy, but I could be mistaken on that.

 

There are a lot of little mechanics in doom that are useful wrt timing, planning shots and where to attack from. Certainly spatial awareness, where people can come from/go, spawn points, (spawn kills are a big thing in doom dm) places where certain weapons reign supreme, (like rockets in a long hallway) and other types of the more basic elements of strategy are good to have down, but there are also some lesser known things, or more-so really subtle things, such as bfg mechanics, or that hitscans like a shotgun usually hit better when fired from above. I'm not the best source of knowledge on this stuff by any stretch and I'm not sure how any of that would transfer over to a self-taught AI, but I think what you want to do is really cool and if you need a target dummy I'll be happy to run around aimlessly for a bit ^^

Link to post

That video is Dwango5 map01, SSL2, and Judas23_

 

@Fonze Doomseeker supports odamex as well

 

Freelook and jumping are optional mechanics of course - the user does not necessarily need to use them in most maps, even in newschool ones. But I'd agree that oldschool is probably the best way to go for this particular project. Flags (settings) are different between ports so it'll be a matter of choosing which port you believe is best suited for the project (you could go chocolate doom for the most "pure" experience).

Link to post

Yes, I agree with you guys, it looks like the old-school control style is the way to go. From what I've seen in the tournament videos, it definitely does not look like beating humans will be easy. Actually, I expect humans to reign supreme for maybe another year or so.

DeepMind is doing similar work with Quake III, but even they didn't dare to tackle the whole game yet: https://deepmind.com/blog/capture-the-flag-science/

This CTF mode is a very simplified version of Q3, without bunnyhop, without most weapons, etc.

 

Thanks for suggestions @Fonze, I will check Odamex and Doom Explorer now.

Link to post

Very interesting topic! I thoroughly enjoyed watching the recordings of the prior AIs attempting FFA deathmatch. 

 

If you have any further questions on Doom multiplayer I would absolutely take heed of any advice @Decay can provide. He's a bona fide Doom multiplayer expert and I don't think there's anything worth knowing about it that he doesn't know. He's certainly my first port of call for any questions on modern Doom multiplayer.

 

In terms of videos, I occasionally play "newschool" FFA with a few people on modern mapsets like NeonDM. They tend to be streamed, so you can watch recordings of such matches like here.

 

 

Link to post

Hi sun_stealer,

 

Very interesting project.  I'd be interested in assisting somehow; at the very minimum I can show you how to watch recordings (from the port.... not on Youtube!) and supply you with some great examples.  Funny enough, I am a chess player as well as a DM player... and your robots have already crushed us humans in that!

 

Once you've set an AI to learn for so many hours, I'd be interested in seeing some of the recording playbacks and giving some feedback on what it has learned, and what it's still clueless on.

 

The interesting thing about reinforcement learning and Doom, as opposed to chess, is how it considers scenarios.  Is it trying to adjust its bias towards good things on a game by game basis... a frag by frag basis or a scenario by scenario basis?  Are you starting from ground zero of not knowing the controls, or are you starting it on the basis that it understands the controls?

 

Either way, I'd be interested in helping and also potentially playing in a tournament against AIs as you've stated, provided you're using classic rules and maps.

Link to post

Hi, Devastation! Very good and relevant questions.

 

Quote

I can show you how to watch recordings (from the port.... not on Youtube!) and supply you with some great examples

 

That'd be awesome! If you can provide a few representative examples of high-level gameplay, I'd be very interested! Following the discussion above, I'll be most interested in "vanilla" Doom experience, at least without jumping and vertical aim. In particular, I'd like to see highly skilled deathmatch and duel gameplay on popular classic maps.

 

Quote

Once you've set an AI to learn for so many hours, I'd be interested in seeing some of the recording playbacks and giving some feedback on what it has learned, and what it's still clueless on.

 

Sure, I will provide updates, and I think we can use this forum as a communication hub between me and the Doom community. This is working out great so far!

The potential training scheme will require billions of game frames of experience, which (at ~35 FPS) is on the order of years or even tens of years of playing Doom for the agent. In real time this will be days and weeks of training on a small GPU cluster. We don't have OpenAI resources to push it further than that, but it'll still be substantial.
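As a rough sanity check on those numbers, the conversion from frames to game time can be worked out in a few lines. This is just arithmetic, assuming 35 frames per second of game time and no frame skipping; the frame budgets below are illustrative placeholders, not our actual training plan.

```python
# Back-of-the-envelope: how much in-game time is N frames of experience?
# Assumes 35 frames per second of game time and no frame skipping.

FPS = 35
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def frames_to_years(num_frames: int, fps: int = FPS) -> float:
    """Convert a number of game frames into years of continuous play."""
    return num_frames / fps / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Illustrative budgets only, not measured numbers.
    for frames in (1_000_000_000, 10_000_000_000):
        print(f"{frames:,} frames ~ {frames_to_years(frames):.1f} years of play")
```

So one billion frames is a bit under a year of non-stop play, and ten billion is roughly nine years.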

Keep in mind that this is not a short-term thing; the expected timeline is on the order of months.

 

I haven't really started working on it yet (finishing some previous stuff), but I already did some setup work and had my first "it's alive!" moment: 

Spoiler

 

This is the most primitive baseline algorithm, trained for just 30 minutes or so against easy bots; so far it has only learned to shoot and generally stumble around. It has no clue about health packs, doesn't have any memory, etc. But you have to start somewhere.

Quote

The interesting thing about reinforcement learning and Doom, as opposed to chess, is how it considers scenarios.  Is it trying to adjust its bias towards good things on a game by game basis... a frag by frag basis or a scenario by scenario basis?

 

Training is usually done in the so-called "episodic setting"; in our case I guess one episode will be the entire duel/deathmatch, and the final objective is to win the match. The agent will try to steer the future towards more favorable states (higher probability of winning, e.g. higher score, better position). The major difference compared to chess is that the environment dynamics are considered unknown: we don't use the Doom engine to look ahead through different branches of the future, so basically the agent has to figure out how the game works and how to plan directly from pixels by playing the game. Also, Doom is partially observable, which means the true state is fundamentally unknowable to the agent, so memory and awareness are very important. A chess AI, on the other hand, sees the whole board, so it can choose each move from the current position alone (the Markov property).
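To make the "episodic setting" a bit more concrete, here is a minimal sketch of how credit for a match outcome flows back to earlier steps. It assumes one episode is one match, that the only reward arrives on the final step, and a discount factor `gamma`; the function name and the toy numbers are made up for illustration and are not taken from our code.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates all future (discounted) reward.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Hypothetical 5-step "match": no reward until the agent wins at the end.
    episode_rewards = [0.0, 0.0, 0.0, 0.0, 1.0]
    print(discounted_returns(episode_rewards, gamma=0.9))
```

Every earlier step ends up with a positive return even though the reward only shows up at the end, which is the sense in which the agent is pushed towards states that make winning more likely.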

 

Quote

Are you starting from ground zero of not knowing the controls, or are you starting it on the basis that it understands the controls?

 

I think from scratch is more interesting, although we might experiment with bootstrapping the agent from some human replays as DeepMind AlphaStar team did. Undecided at this point.

Link to post

Sounds good.

 

I can provide you with all of the games from Quakecon 2013 (between myself, Demonsphere and JKist3) in demo format to view.  It's probably good to see a good player against an "average" player too, to gain a sense of how things can go wrong.  I'll dig up some historic games over the years.

 

You are stating that everything is done via pixel, are you allowed to use sound?  Sound is a major component of DM... if the bot is not allowed to base anything on sound, I doubt it could ever compete properly.  It will also need a way of recognizing and knowing the map, for backwards perfect movement, and timing.

 

A good example of timing in DM is the way one moves.  There is generally a tradeoff in DM movement: perfect moving (which is fastest) has the least visibility, so if you run into an opponent, you are in a bad spot.  Less optimal moving may be slightly slower, but grants the best visibility to be prepared if you run into an enemy.  If I know my opponent's current location (through sound or otherwise), and know that they cannot get to a specific location in a set amount of time, I would use perfect tight movement to get there as fast as possible.  If a player's position is unknown, I would choose the safer movement style.

 

Some of these things may be hard to learn if an episode is an entire game.  There are many pieces in a game that can promote good play or bad play.  Think about yourself as a gamer.  If you get a frag, you generally say "wow I did something good", and if you die you generally think "I did something wrong".  Maybe an episode should be at least considered on a frag by frag level.  Do more of what gets me frags and do less of what gets me killed.  Going more advanced, in human terms, the outcome of many situations is actually a better or worse game state position from a given scenario, even if a frag was not achieved, such as dealing only 20% of damage but receiving 80% of damage.

 

Below is a list of things that every good DM player should be able to do, and some thoughts of how an AI could maybe learn these:

 

Good movement.  Computer likes when it moves and does not hit a wall, computer dislikes when it moves and touches a wall.  There also needs to be a way that the computer knows the level, so it can run backwards through the level without touching a wall.

 

Fast movement (sr40 and hopefully eventually sr50).  Computer detects pixels to gauge its in-game speed.  Computer likes when it moves faster, dislikes when it moves slower.  I assume eventually the computer would start randomly pressing enough buttons to figure out sr50.

 

Aim.  Computer likes when it fires and sees blood splats *OR* hears a pain groan (sometimes you hear pain but cannot see it).  Computer dislikes when it fires and neither sees a blood splat nor hears a pain groan.  Computer should be weighted towards liking more blood splats; it should eventually pinpoint the proper pixel to fire on for maximum damage, and hopefully understand the distance between players.

 

Damage.  As above, the computer can estimate damage based on blood splats and should like seeing more blood splats.  This will weight it towards the proper weapons in various scenarios as well.  It should know roughly how much damage was dealt in total over various situations, and how much health an enemy has left.

 

Damage balance.  This is difficult to describe, but it's knowing to fire at the closest range possible, just before your opponent reloads.  The basic premise in Doom SSG battles is that you rush in while your opponent is reloading, take your shot, and then back away while you're reloading.  I have no idea how to help an AI learn this, but maybe the above notes on damage will just make it understand.

 

Sounds.  The computer needs to be able to pinpoint an exact location of an enemy and their *direction*, given it hears a sound (landing thud, damage, reloading shot, etc.).  This requires map knowledge.

 

Enemy timing.  Given a location of an enemy against passing time, the computer should know the furthest an enemy could have gone and the likelihood that it can happen.  This requires map knowledge and also how fast an enemy generally moves to get from any given location to another location.

 

Opponent state.  Computer needs to know weapons an opponent has for sure, weapons opponent likely has, the current ammo an enemy has for rockets and BFG, the enemy's current health.  The computer should eventually be varying its play based on these game states.

 

Respawn whoring.  It needs to somehow figure out the optimal place to be when an opponent has just died; it should instantly start moving in that direction given the known spawn points of the map.  It should be varying this based on current state (health, weapons).

 

Those are the basics.  If the computer could learn all of those things, it'd be about an average DM player, maybe above average given that it could probably aim perfectly.  Going more advanced, it'd need to learn how to manipulate its own sounds to confuse or trick an opponent, how to guess an opponent's future plays and actions by associating them with previously seen actions, how to vary its play away from the "best" play at times in order to stay unpredictable, and some other stuff.
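As a rough illustration of the "enemy timing" point in the list above: if the map is abstracted as a graph of waypoints with travel times, "the furthest an enemy could have gone" becomes a shortest-path query. This is a hand-written sketch of the reasoning a human does, not something a pixel-based learned agent would be handed; the waypoint names, travel times, and the `reachable_within` helper are all made up for illustration.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical waypoint graph: node -> list of (neighbor, travel time in seconds).
# The room names and times are invented, not measured on a real map.
Graph = Dict[str, List[Tuple[str, float]]]

EXAMPLE_MAP: Graph = {
    "ssg_room": [("hallway", 1.5), ("stairs", 2.0)],
    "hallway": [("ssg_room", 1.5), ("rocket_ledge", 3.0)],
    "stairs": [("ssg_room", 2.0), ("rocket_ledge", 1.0)],
    "rocket_ledge": [("hallway", 3.0), ("stairs", 1.0)],
}


def reachable_within(graph: Graph, last_seen: str, elapsed: float) -> Dict[str, float]:
    """Dijkstra from the enemy's last known node.

    Returns every node the enemy could have reached within `elapsed` seconds,
    mapped to the earliest possible arrival time.
    """
    best: Dict[str, float] = {last_seen: 0.0}
    queue: List[Tuple[float, str]] = [(0.0, last_seen)]
    while queue:
        time_so_far, node = heapq.heappop(queue)
        if time_so_far > best.get(node, float("inf")):
            continue  # stale queue entry, a faster route was already found
        for neighbor, cost in graph[node]:
            arrival = time_so_far + cost
            if arrival <= elapsed and arrival < best.get(neighbor, float("inf")):
                best[neighbor] = arrival
                heapq.heappush(queue, (arrival, neighbor))
    return best
```

For example, `reachable_within(EXAMPLE_MAP, "ssg_room", 2.5)` rules out `rocket_ledge`, since the fastest route there takes 3.0 seconds in this toy graph.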

 

Are you allowed to throw the AI into different maps for various periods of time to learn specific things?  For example, D5M18 would actually teach it much of the aiming and range that it requires to SSG battle properly.  You could also throw it into a map of just walls in order to get it to learn movement via not touching a wall and detecting its speed.  It would hopefully learn how to strafe properly and know its direction and where to go by being able to detect wall vs non-wall.

 

Also, some DM maps are simpler to learn.  For example, ssl2 is much more to do with aim, good movement and weapon control and not as much about determining an opponent's position using logic like D5M1 is.

 

Anyhow, hopefully this is of some help and gives you an idea of what you're getting into if you're trying to create a human-beating bot.  It also depends on your end goal: do you want it to be an average DM/tournament player?  Just be able to beat casual players?  Or top notch, giving the world elite a run for their money?

Link to post

For some reason I always thought it would be trivial to make an "invincible" (or at least highly annoying) bot simply by exploiting its inherent strengths vs humans, like zero reaction time, internal knowledge of the game engine, dead-on accuracy with hitscan weapons etc. That's how AI opponents have been programmed for years, after all: with what to a human player would appear as borderline cheating (aka My Rules Are Not Your Rules). Come to think of it, a superior human opponent might appear to be just that [a cheater] to a lesser one ;-)

 

A good example where this was (ab)used is none other than Xaero from Quake 3 Arena. He is pretty much undefeatable unless you manage to find a hole in the AI and spawncamp him.

 

However, real-life DM games are often a super-brutal affair, and the top-ranking players in ZDaemon at least (e.g. Derrida) seem to have superhuman-like reflexes, never miss an SSG close-up shot to your face, take every opportunity to even lightly pelt a target from a distance etc. and in the end it's a contest about who'll rack up the most kills. The top players needn't even duke it out among themselves for that to work (unless it's a 1-on-1). And let's not mention all those maps where the "gameplay" is reduced to camping/spamming the BFG and/or countering said camping/spamming. There's a lot of metagaming involved here.

Link to post

Would love to help you research this (i.e. duel the hell out of Dev and others). Odamex doesn't have full servers all day, but it has a strong oldschool duel community that would probably love to help. It's easiest to round people up here: https://discord.gg/ZuwmxX2

Link to post

Hi all! I have fully switched to this project and I think we'll be working on it for at least 2-2.5 months from now. I trained a couple of very simple baseline agents just to get a feel of what I'm dealing with, and now I'm working on infrastructure for distributed reinforcement learning to be able to leverage many machines and many GPUs to train big models on more data.

After this, the plan is to set up self-play training, where the agent battles against current and past copies of itself. This will be done either with ViZDoom's built-in multiplayer or by modifying the engine to allow multiple agents to exist within the same process (not sure if this is possible without rewriting large portions of the engine code).
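The "current and past copies of itself" part can be sketched as a small opponent pool. This is only an outline under assumptions: the class name, the 50/50 split between the newest and older snapshots, and the pool size are arbitrary placeholders, and the strings stand in for saved model checkpoints.

```python
import random
from typing import List, Optional


class OpponentPool:
    """Keeps snapshots of past policies and samples opponents for self-play."""

    def __init__(self, latest_prob: float = 0.5, max_size: int = 10,
                 seed: Optional[int] = None):
        self.latest_prob = latest_prob   # chance of facing the newest snapshot
        self.max_size = max_size         # oldest snapshots are dropped beyond this
        self.snapshots: List[str] = []   # stand-ins for saved model checkpoints
        self.rng = random.Random(seed)

    def add(self, snapshot: str) -> None:
        """Store a new snapshot, evicting the oldest one if the pool is full."""
        self.snapshots.append(snapshot)
        if len(self.snapshots) > self.max_size:
            self.snapshots.pop(0)

    def sample(self) -> str:
        """Pick the newest snapshot with probability latest_prob, else an older one."""
        if not self.snapshots:
            raise ValueError("pool is empty; add a snapshot first")
        if len(self.snapshots) == 1 or self.rng.random() < self.latest_prob:
            return self.snapshots[-1]
        return self.rng.choice(self.snapshots[:-1])
```

Mixing in older copies is a common way to keep the agent from overfitting to its most recent self; how much to mix is something we'd have to find out experimentally.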

 

Thank you for your amazing feedback, and I apologize that I wasn't able to reply right away!

 

@DevastatioN

Quote

I can provide you with all of the games from Quakecon 2013 (between myself, Demonsphere and JKist3) in demo format to view.  It's probably good to see a good player against an "average" player too, to gain a sense of how things can go wrong.  I'll dig up some historic games over the years.

 

Right now we're not planning to leverage large amounts of demonstration data to train the agent; the plan is to train the agent to play against itself from scratch (~AlphaZero style), although this might change in the future. That said, I'd really love to watch some more high-level gameplay (as you said, both pro vs pro and pro vs amateur). This will allow me to understand the task better. So I'd really appreciate it if you could send me a few games that you find particularly remarkable. Let's maybe PM about this?

 

Quote

You are stating that everything is done via pixel, are you allowed to use sound?  Sound is a major component of DM... if the bot is not allowed to base anything on sound, I doubt it could ever compete properly.  It will also need a way of recognizing and knowing the map, for backwards perfect movement, and timing.

 

Very good questions. I am very excited about incorporating the sound because this is an almost completely unexplored area in AI (combining audio and visual modalities for decision making). Showing that the agent can perform better by using sound would be a very nice result.
The plan, in the beginning, is to push it as far as we can from pure visual input, and I hope this bot can reach novice/amateur human level. And from there we will add more domain-specific features, like sound, or maybe something else.

 

Map awareness and timing sense should emerge to a certain degree in the process of training, because we will train the agent on specific map(s), giving it a chance to memorize the layout.
On top of that, I have some ideas on how to encourage better temporal and spatial understanding during training, but I want to test these before disclosing. This is also a very interesting research direction, and largely unexplored.

 

Quote

A good example of timing in DM is the way one moves.  There is generally a tradeoff in DM movement: perfect moving (which is fastest) has the least visibility, so if you run into an opponent, you are in a bad spot.  Less optimal moving may be slightly slower, but grants the best visibility to be prepared if you run into an enemy.  If I know my opponent's current location (through sound or otherwise), and know that they cannot get to a specific location in a set amount of time, I would use perfect tight movement to get there as fast as possible.  If a player's position is unknown, I would choose the safer movement style.

 

Gotcha. Learning more and more about the game mechanics I really started to appreciate this scene. I understand that beating a top human in something like Doom duel will be extremely hard, and therefore this is not necessarily the goal of this project right now. Rather, we want to push it as far as we can, and if we're not able to compete against top humans, we want to identify why, and maybe find out something interesting that current state-of-the-art is missing. This is not about building the best possible bot (because this would involve a lot of hacking, cheating, manual tuning), this is about doing impactful research.


 

Quote

 

Below is a list of things that every good DM player should be able to do, and some thoughts of how an AI could maybe learn these:

 

Good movement.  Computer likes when it moves and does not hit a wall, computer dislikes when it moves and touches a wall.  There also needs to be a way that the computer knows the level, so it can run backwards through the level without touching a wall.

 

Fast movement (sr40 and hopefully eventually sr50).  Computer detects pixels to gauge its in-game speed.  Computer likes when it moves faster, dislikes when it moves slower.  I assume eventually the computer would start randomly pressing enough buttons to figure out sr50.

 

Aim.  Computer likes when it fires and sees blood splats *OR* hears a pain groan (sometimes you hear pain but cannot see it).  Computer dislikes when it fires and neither sees a blood splat nor hears a pain groan.  Computer should be weighted towards liking more blood splats; it should eventually pinpoint the proper pixel to fire on for maximum damage, and hopefully understand the distance between players.

 

Damage.  As above, the computer can estimate damage based on blood splats and should like seeing more blood splats.  This will weight it towards the proper weapons in various scenarios as well.  It should know roughly how much damage was dealt in total over various situations, and how much health an enemy has left.

 

Damage balance.  This is difficult to describe, but it's knowing to fire at the closest range possible, just before your opponent reloads.  The basic premise in Doom SSG battles is that you rush in while your opponent is reloading, take your shot, and then back away while you're reloading.  I have no idea how to help an AI learn this, but maybe the above notes on damage will just make it understand.

 

Sounds.  The computer needs to be able to pinpoint an exact location of an enemy and their *direction*, given it hears a sound (landing thud, damage, reloading shot, etc.).  This requires map knowledge.

 

Enemy timing.  Given a location of an enemy against passing time, the computer should know the furthest an enemy could have gone and the likelihood that it can happen.  This requires map knowledge and also how fast an enemy generally moves to get from any given location to another location.

 

Opponent state.  Computer needs to know weapons an opponent has for sure, weapons opponent likely has, the current ammo an enemy has for rockets and BFG, the enemy's current health.  The computer should eventually be varying its play based on these game states.

 

Respawn whoring.  It needs to somehow figure out the optimal place to be when an opponent has just died; it should instantly start moving in that direction given the known spawn points of the map.  It should be varying this based on current state (health, weapons).


 

What you're suggesting here is called "reward shaping". All these things can definitely help in the learning process but there's always a tradeoff: the more you modify your final objective by adding these auxiliary signals (e.g. positive reinforcement for fast movement), the more the agent will gravitate away from the ultimate goal, which is winning the match.  So ideally we want it to just maximize the probability of winning the match, and that's it. After a lot of training, these "auxiliary" objectives should emerge naturally, you can call them instrumental goals, proxies on the way to the terminal goal: winning the match. E.g. when the agent does more damage than the opponent in an exchange, it should increase the internal belief about the probability of winning the match, which will cause positive reinforcement. Ultimately, only experiments will tell just how much reward shaping we will have to use. Maybe some tricks are just extremely hard to learn from pure self-play and we'll have to "teach" the agent.
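Here is a minimal sketch of what that tradeoff looks like in code. The auxiliary signals, their weights, and the `shaping_scale` knob are hypothetical and untuned; the only point is that the shaped reward is the true objective plus weighted extras, and that turning the extras down to zero recovers the pure "win the match" objective.

```python
from typing import Dict

# Hypothetical auxiliary signals and weights; none of these values are tuned.
SHAPING_WEIGHTS: Dict[str, float] = {
    "frag": 1.0,
    "death": -1.0,
    "damage_dealt": 0.01,
    "damage_taken": -0.01,
}


def shaped_reward(match_outcome: float, events: Dict[str, float],
                  shaping_scale: float = 1.0) -> float:
    """Combine the true objective with weighted auxiliary signals.

    match_outcome: +1 / -1 on the final step of a match, 0 otherwise.
    events: per-step counts or amounts, e.g. {"frag": 1, "damage_taken": 20}.
    shaping_scale: 0.0 recovers the pure "win the match" objective.
    """
    auxiliary = sum(SHAPING_WEIGHTS.get(name, 0.0) * value
                    for name, value in events.items())
    return match_outcome + shaping_scale * auxiliary
```

The larger `shaping_scale` is, the more the agent is optimizing for frags and damage rather than for winning, which is exactly the drift away from the ultimate goal described above.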

 

Quote

Are you allowed to throw the AI into different maps for various periods of time to learn specific things?  For example, D5M18 would actually teach it much of the aiming and range that it requires to SSG battle properly.  You could also throw it into a map of just walls in order to get it to learn movement via not touching a wall and detecting its speed.  It would hopefully learn how to strafe properly and know its direction and where to go by being able to detect wall vs non-wall.

 

Also, some DM maps are simpler to learn.  For example, ssl2 is much more to do with aim, good movement and weapon control and not as much about determining an opponent's position using logic like D5M1 is.

 

Anyhow, hopefully this is of some help and gives you an idea of what you're getting into if you're trying to create a human-beating bot.  It also depends on your end goal: do you want it to be an average DM/tournament player?  Just be able to beat casual players?  Or top notch, giving the world elite a run for their money?

 

The task of training a general-purpose bot that can beat a human on any map in any scenario seems very hard at this point. I think at the beginning we should focus on deathmatch rather than duel, as DM is less strategic and more reactive, and also we will probably focus on just a few maps. Also, I think duels are more interesting but much more challenging.

As I mentioned, the goal is to conduct interesting AI research, not necessarily to build the absolute perfect bot. But we like to compare our AI agents to humans, because a skilled human can really demonstrate what a truly intelligent agent can pull off, and how far we are from this level.

Link to post
On 6/11/2019 at 2:03 AM, Maes said:

For some reason I always thought it would be trivial to make an "invincible" (or at least highly annoying) bot simply by exploiting its inherent strengths vs humans, like zero reaction time, internal knowledge of the game engine, dead-on accuracy with hitscan weapons etc. That's how AI opponents have been programmed for years, after all: with what to a human player would appear as borderline cheating (aka My Rules Are Not Your Rules). Come to think of it, a superior human opponent might appear to be just that [a cheater] to a lesser one ;-)

 

A good example where this was (ab)used is none other than Xaero from Quake 3 Arena. He is pretty much undefeatable unless you manage to find a hole in the AI and spawncamp him.

 

However, real-life DM games are often a super-brutal affair, and the top-ranking players in ZDaemon at least (e.g. Derrida) seem to have superhuman-like reflexes, never miss an SSG close-up shot to your face, take every opportunity to even lightly pelt a target from a distance etc. and in the end it's a contest about who'll rack up the most kills. The top players needn't even duke it out among themselves for that to work (unless it's a 1-on-1). And let's not mention all those maps where the "gameplay" is reduced to camping/spamming the BFG and/or countering said camping/spamming. There's a lot of metagaming involved here.

 

Valid point. Let's be absolutely clear here, if we wanted to just build a human-beating Doom bot, we would do it via accessing the internal game state, hand-programming perfect aim, etc.

This is definitely a nice technical exercise, but this is not AI research. What we want to do is to build an agent that can learn how to play the game from experience and has the same access to game state as humans do, e.g. screen pixels & sound. Achieving something like this is a tiny step on the way to general intelligence - an agent that can make effective decisions in the real world.

 

Arguably, this "trained" bot should be also more interesting to play against. It is not hand-programmed so it should exhibit less robotic behavior, although should still be exploitable as it lacks general reasoning capabilities of a human and cannot quickly adapt. A good comparison can be AlphaStar StarCraft II bot versus in-game "Cheater" AI. A neural network based bot is definitely a more interesting opponent to play against.


I would like to see if and how soon an AI bot thrown into a BFG spamming/camping map would start doing so itself, and how effective it could become in pulling that off.

 

That would have the interesting side-effect/potential to throw off its goal-based learning, because doing so often gives the best K:D ratio in a map (by quite a margin, too) even if you don't get 1st place in the end. Speaking of being #1, you may be so in the short term "the easy way", unless your opponents start countering it (collectively "starving" the camper/spammer of targets works better than actively hunting him down).

 

In other words, it would introduce an "easy money"/"temptation"/"quick and dirty" mechanic to the learning process, which can be hard to outgrow even for human players.


I have a question for people who know how multiplayer works in different Doom source ports.

As you might know, VizDoom (a platform for agent training that we're using) is based on a relatively recent version of ZDoom.

On the other hand, Odamex is based on CSDoom, which is based on a very old version of ZDoom and features a different networking code (?)

 

Do you have any idea whether it'd be possible to establish a multiplayer game between these versions? Is it possible to create a VizDoom (ZDoom) server that can be joined using Odamex client or something else modern? Will my server show up in the Doomseeker/explorer?

 

EDIT:

Ok, I read some more documentation, and it looks like Odamex and other CSDoom derivatives use a different, more modern approach to networking, with a proper client-server architecture. ZDoom uses ancient networking code from the original Doom open-source release. So I don't think those two really play well. Another annoying fact is that ZDoom requires all clients to be connected for the deathmatch game to start, which is not the case for Odamex.

I guess I have only two options here:

1) port parts of the functionality of VizDoom into something like Odamex (e.g. python APIs, screen buffers, etc.) This is a big effort.

2) just leave it in ZDoom world and ask people to use something like GZDoom client to play against AI. This sucks because I wanted my server with AI agents to be always available in Doomseeker, which won't be possible.

 

I can do #2 first and if there's interest and resources I can think about #1.

Edited by sun_stealer


Not only will most ports not play well with one-another, in most cases, different versions of the same port won't play together either, especially anything based off zdoom as demo compatibility is not a focus, which basically means that there can and prolly will be tiny, obscure things, or large, obvious things, which will cause different versions of even the same program to process the same set of inputs differently.

 

All of it kinda overlaps with the demo recording/playback scene and is based around the same logic. Even the smallest change to the engine can affect how stuff gets moved, what kind of damage it takes etc which can thus snowball and affect other things, like one client not knowing its player is dead, or whether they picked up ammo/health or made a jump if say the physics push them too far in any direction from stuff like being hit or bumping into walls, players, decoration, and other "things." Tons of tiny things can cause issues later down the line to the point that it can even be tough to pinpoint at what point something went wrong that caused an eventual desync.
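To illustrate the snowballing, here is a tiny Python sketch. It is a toy 1-D movement model, not code from any Doom port: `simulate` is a hypothetical function and the two friction values are arbitrary stand-ins for "one engine version" and "a slightly tweaked one". The same inputs give identical results on identical code, but the small change produces a gap that keeps growing tic after tic.

```python
def simulate(thrusts, friction):
    """Toy 1-D movement: returns the player's position after each tic."""
    pos, vel, trace = 0.0, 0.0, []
    for thrust in thrusts:
        vel = (vel + thrust) * friction  # apply input, then friction
        pos += vel
        trace.append(pos)
    return trace


inputs = [1.0] * 35                  # hold "forward" for 35 tics
same_a = simulate(inputs, 0.90625)
same_b = simulate(inputs, 0.90625)   # identical engine, identical inputs
other = simulate(inputs, 0.90)       # "other version": friction tweaked slightly

gap_first = abs(same_a[0] - other[0])    # disagreement after the first tic
gap_last = abs(same_a[-1] - other[-1])   # disagreement after the last tic
print(same_a == same_b, gap_first < gap_last)
```

Once the two positions disagree, anything that depends on position (pickups, hits, collisions) can disagree as well, which is the kind of desync described above.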

 

So the long of the short of it is you would probably have to have players playing the exact same program as you are if you do not port the bot stuff over to a different program, such as Odamex or Zandronum, in which case then you (or more-so your bot) will still be playing on the same exact program as others ;p And if Oda updates in the meantime just keep a copy of the version you use on-hand; likely a server can still be made for older versions?

 

Beyond this is out of my knowledge, so maybe somebody more knowledgeable can help you further on the rest.


This is my personal opinion but given the roadblock you’re facing, I’d say ask for volunteer deathmatchers and send them VizDoom so they can play against the bot-in-progress that way, locally, then send the results back to you.

 

It’s lame not having a 24/7 CS Doom server (Oda/Zan/ZDaemon) up that anyone can join, but at the same time sending it out to others for local testing will remove the insanely annoying element of lag as well. Additionally, from what you describe, trying to port VizDoom to Odamex will consume a lot of time for something that doesn’t directly result in improving the bot. Better to just keep using VizDoom, sending it to experienced deathmatchers, and worry about the roadblock of porting VizDoom’s functionality into Odamex later, if at all.

 

Time spent on this is valuable, no need to waste it trying to get foreign Doom ports to “talk” :)


I agree with @Doomkid, porting VizDoom to Odamex or Zandronum may take a really long time (possibly months of full-time work), and the synchronised networking of old DOOM is practically unusable over the internet, so I think sending out a VizDoom binary to people is your best option (assuming that is feasible).


Thanks a lot for the opinions on this, I really appreciate it! I agree that porting VizDoom functionality into other mods will not be a good investment of resources now.

@andrewj can you please elaborate on:

16 hours ago, andrewj said:

synchronised networking of old DOOM is practically unusable over the internet

 

Is it unreliable/laggy/unplayable or is it just inconvenient to use because everyone should be present before the game starts and there's no server browser?

I'd still love to be able to host my own server, even given the limitations, just because it is so much easier to set up AI bots locally. If I need to make a full local deployment (e.g. game with AI bots), I'd have to package the AI agents' neural networks in some installable package, and that is a lot of effort, although much less than porting VizDoom.

 

FYI, there's also a github thread on this https://github.com/mwydmuch/ViZDoom/issues/392

7 hours ago, sun_stealer said:

Is it unreliable/laggy/unplayable or is it just inconvenient to use because everyone should be present before the game starts and there's no server browser?

Traditional DOOM networking works on the principle that every client is in lock-step with the rest. If you imagine each client is a big finite state machine, then every client is stepping through the exact same set of states at the same time.  If any client gets out-of-sync, then that client cannot continue to play since there is no mechanism to send the full state from a good client.  This is why all clients must be present at the beginning of a game and late joining is not possible.

 

The networking part sends the user commands (ticcmd_t) from your client to all others, and no client can step forward in time until it receives the ticcmd_t from every single client.  So lost packets mean clients have to wait until they receive another copy of that packet, and one waiting client holds up the rest as explained above.  Since the internet is generally not super reliable, this means games over the internet can become very laggy.  Ping time is a big factor too: even when there is no packet loss, the client with the biggest ping will slow down the game for everybody.  So while internet play can work, I've heard that it is generally unplayable unless everyone is geographically close to each other.
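To make the waiting rule concrete, here is a minimal Python sketch. It is not the actual Doom or ZDoom networking code: `LockstepClient`, `receive` and `try_advance` are hypothetical names, and a ticcmd is reduced to a plain string. It only shows that a tic cannot be simulated until the command from every player has arrived.

```python
from collections import defaultdict


class LockstepClient:
    """Toy model of one peer in a lock-step game (names are hypothetical)."""

    def __init__(self, player_ids):
        self.player_ids = set(player_ids)
        self.tic = 0  # next game tic to simulate
        # tic number -> {player id -> command for that tic}
        self.inbox = defaultdict(dict)
        self.history = []  # commands applied so far, in order

    def receive(self, tic, player_id, cmd):
        """Store a command that arrived over the network."""
        self.inbox[tic][player_id] = cmd

    def try_advance(self):
        """Simulate one tic only if every player's command for it is here."""
        cmds = self.inbox.get(self.tic, {})
        if set(cmds) != self.player_ids:
            return False  # at least one command is missing: everybody waits
        # Apply commands in a fixed order so all peers compute the same state.
        self.history.append(tuple(cmds[p] for p in sorted(self.player_ids)))
        del self.inbox[self.tic]
        self.tic += 1
        return True


client = LockstepClient(player_ids=[0, 1, 2])
client.receive(0, 0, "forward")
client.receive(0, 1, "fire")
stalled = client.try_advance()   # player 2's packet was lost: no progress
client.receive(0, 2, "strafe")   # the resend finally arrives
advanced = client.try_advance()  # now the tic can be simulated
print(stalled, advanced, client.tic)
```

In this sketch the slowest or least reliable connection sets the pace for everyone, which matches the lag behaviour described above.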


Well, the typical solution here is to consider one of the players as the "master" (typically the host) and in case of missed sync/delays, any decisions it makes are final. If your shoot command came too late to the master, too bad, it probably won't register. If you died because of that, too bad, but "a frag is a frag". You get the idea. There might be stuff like client/server side prediction for things like movement, but important gameplay decisions are usually made unilaterally by the master.

 

This is a compromise, of course, but it does allow for internet play with reasonably closely-pinged opponents. Depending on the algo used, players with either lower or higher ping might have an advantage, but usually high ping means you'll get fragged a lot, especially in 1-on-1 situations. You can still be a decent BFG spammer, in this case 😁
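For contrast with lock-step, here is a rough Python sketch of the "master decides" idea. It is an illustration only, not how any particular port implements it: `AuthoritativeHost`, `receive` and `step` are hypothetical names, and a missing command is simply treated as "noop". The host never waits; a command that shows up after its tic was decided is dropped.

```python
class AuthoritativeHost:
    """Toy host that never waits for slow clients (names are hypothetical)."""

    def __init__(self, player_ids):
        self.player_ids = list(player_ids)
        self.tic = 0
        self.pending = {}   # (tic, player id) -> command
        self.applied = []   # what the host actually simulated, per tic
        self.dropped = []   # commands that arrived after their tic was decided

    def receive(self, tic, player_id, cmd):
        if tic < self.tic:
            # Too late: that tic has already been decided by the host.
            self.dropped.append((tic, player_id, cmd))
        else:
            self.pending[(tic, player_id)] = cmd

    def step(self):
        """Advance one tic with whatever has arrived; missing input = noop."""
        cmds = {p: self.pending.pop((self.tic, p), "noop")
                for p in self.player_ids}
        self.applied.append(cmds)
        self.tic += 1
        return cmds


host = AuthoritativeHost(player_ids=[0, 1])
host.receive(0, 0, "fire")
first = host.step()          # player 1's packet is late, the host moves on
host.receive(0, 1, "fire")   # arrives after tic 0 was decided: dropped
print(first, host.dropped)
```

The game keeps its pace, and the cost falls on the high-ping player whose shot never registers.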


Just stumbled upon this thread. It's pretty cool!  If you ever focus on duel you should make sure you are training the bot on game settings consistent with one of the competitive standards in Doom duel.  Differences in match settings that may seem minor to an outsider can change the cost/benefit ratio of many in-game decisions. Making sure you can input the correct match settings into VizDoom will be important. Testing a duel bot vs humans on a competitive standard will also be the best measurement of bot performance as this is where the highest human playing level has been reached and these are the settings where the habits of the humans playing against the bots have been developed and are most applicable.


Again, thanks for the input guys, I really appreciate it!

Using about 100 hacks I was able to get this first self-play setup, where I will do my first multi-agent experiments: https://www.youtube.com/watch?v=dHGSZRFTnf0

These are 8 random agents (untrained) acting together in a single deathmatch, a scaffolding for the neural network training environment.

I was able to collect about 11500 FPS of Doom gameplay on a single computer, which means that the agents will be learning at about 300x speed compared to real time. But this is only the beginning.
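As a quick sanity check of that figure: Doom's game logic runs at 35 tics per second, so the speedup is just the collected frame rate divided by 35. The sketch below assumes one collected frame corresponds to one game tic (no frame skipping), which may not match the actual setup.

```python
DOOM_TICS_PER_SECOND = 35  # rate of Doom's game logic in real-time play

collected_fps = 11500      # throughput quoted in the post
# Assumption: one collected frame == one game tic (no frame skip).
speedup = collected_fps / DOOM_TICS_PER_SECOND

print(round(speedup))  # 329, i.e. roughly 300x real time
```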

 

@JKist3 I agree that duel should be the ultimate goal, and I anticipate this will be very hard for the current state-of-the-art agents. Which makes it that much more exciting, this is a very clear benchmark to beat! We're planning to focus first on like 8-player deathmatch which is much more reactive and chaotic and should be easier for the agents. Then we might start thinking about duels.


Some footage from intermediate tests. Here the task is to just find and kill as many monsters as possible in limited time. I was testing the compositional action space here, basically this agent can press all the available buttons at the same time, while previous bots were only allowed one action at a time (for implementation reasons).
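A small Python sketch of the difference between the two action spaces. The button list is only an example, and the helper functions are hypothetical, not part of VizDoom. Each action is a list of 0/1 values, one per button.

```python
from itertools import product

# Example button set, chosen for illustration only.
BUTTONS = ["MOVE_FORWARD", "MOVE_LEFT", "MOVE_RIGHT",
           "TURN_LEFT", "TURN_RIGHT", "ATTACK"]


def one_at_a_time_actions(buttons):
    """One button (or none) per step: len(buttons) + 1 actions."""
    n = len(buttons)
    noop = [0] * n
    return [noop] + [[int(i == j) for j in range(n)] for i in range(n)]


def compositional_actions(buttons):
    """Any subset of buttons per step: 2 ** len(buttons) actions."""
    return [list(bits) for bits in product([0, 1], repeat=len(buttons))]


single = one_at_a_time_actions(BUTTONS)
combo = compositional_actions(BUTTONS)
run_and_gun = [1, 0, 0, 0, 0, 1]  # forward + attack in the same step

print(len(single), len(combo), run_and_gun in combo, run_and_gun in single)
```

With 6 buttons this is 7 actions versus 64, and something like moving forward while shooting only exists in the compositional set. Enumerating every combination is done here for clarity; with many buttons, a policy would more likely output one on/off decision per button instead.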

 

Interestingly enough, the agent almost never runs out of ammo if pistol clips are regularly collected, therefore the agent didn't learn the concept of ammo conservation (at least not yet at this stage of training). This is reinforcement learning for you: you never know what kind of hacky strategy is going to emerge.

 

 


Some more intermediate results: 


 

Here the agent is training vs built-in scripted bots and has no problem outscoring all of them in 100% of matches. It definitely still has problems with the aim (this agent does not have access to the mouse, only turning buttons), and the overall strategy is very hacky. Hopefully, with self-play, we will be able to see more diverse tactics :)

 

These bots are pretty lame tbh. I add them via addbot console command, and as far as I understand, this creates a bot with random characteristics (aim, movement, etc.)

Does anyone have experience with Doom scripted bots? I would like to have it consistent (e.g. bots are the same across experiments for reproducibility) and potentially also make the bots progressively harder as the training progresses.

 

These videos are just sequences of raw neural network inputs, which is why the resolution is so low. I gotta work on hi-res rendering of the gameplay, maybe via replays.

Edited by sun_stealer

1 hour ago, sun_stealer said:

Some more intermediate results: 


 

Here the agent is training vs built-in scripted bots and has no problem outscoring all of them in 100% of matches. It definitely still has problems with the aim (this agent does not have access to the mouse, only turning buttons), and the overall strategy is very hacky. Hopefully, with self-play, we will be able to see more diverse tactics :)

 

These bots are pretty lame tbh. I add them via addbot console command, and as far as I understand, this creates a bot with random characteristics (aim, movement, etc.)

Does anyone have experience with Doom scripted bots? I would like to have it consistent (e.g. bots are the same across experiments for reproducibility) and potentially also make the bots progressively harder as the training progresses.

Try using my TDBots: Link!
Remember to read the usage guide to understand how to set them up so that your bot doesn't have too much problem fighting them (By default, they have perfect aim and reaction time!)

 

If there's any feature you would want to be added to these bots, to help your neural network train, just tell me and I will look into adding it.

 

The only "random" things in these bots are how they run around the map (a dice roll picks if they turn left or right but otherwise it's the same mechanics), their names and their chat lines. The last two are completely cosmetic though. Plus, the bots are more configurable and much more skilled than the original ZDoom bots, so your neural network will get quite a bit more challenge out of them.

Edited by -TDRR-
