-control

Posted by printz


I saw this neat little release from Doomworld's /newstuff: http://www.doomworld.com/idgames/?id=16596

As it says, it allows you to control the player from an outside application. This would be very interesting for making bots without modifying the Doom source code. But if you move like that, do you get feedback from the Doom world, or is the controller completely blind?
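For what it's worth, here's a rough sketch of what a blind controller like that might look like. It assumes the hook accepts Doom ticcmd-style input (forwardmove/sidemove/angleturn); the pipe path and the text protocol are my inventions for illustration, since I don't know how the release actually wires things up:

```c
/* Blind external controller sketch. Assumes a hypothetical hook in the
 * game that reads ticcmd-style commands from a pipe; the path and the
 * text protocol are made up for illustration. */
#include <stdio.h>

int main(void)
{
    FILE *ctl = fopen("/tmp/doom-control", "w");   /* hypothetical channel */
    if (!ctl)
        return 1;

    /* Run forward for 35 tics (one second at 35 Hz), then stop and turn.
     * With no feedback this is pure dead reckoning: we can't tell whether
     * we hit a wall or walked into a monster along the way. */
    for (int tic = 0; tic < 35; tic++)
        fprintf(ctl, "forwardmove=50 sidemove=0 angleturn=0\n");

    fprintf(ctl, "forwardmove=0 sidemove=0 angleturn=16384\n"); /* 90-degree left turn */

    fclose(ctl);
    return 0;
}
```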

printz said:

But if you move like that, do you get feedback from the Doom world, or is the controller completely blind?


Obvious: you build a bot with full Artificial Vision and Artificial Intelligence sophisticated enough to understand Doom's visuals, and do all that over the Internet while at the same time sharing the experience on Facebook, Twitter and other social media.

Nothing that your average web developer can't do as a morning exercise.

Maes said:

Obvious: you build a bot with full Artificial Vision and Artificial Intelligence sophisticated enough to understand Doom's visuals, and do all that over the Internet while at the same time sharing the experience on Facebook, Twitter and other social media.

Nothing that your average web developer can't do as a morning exercise.

No, that's too ambitious and would require too much fuzzy logic. It's hard to tell a computer what's bad and what's good.

printz said:

No, that's too ambitious and would require too much fuzzy logic. It's hard to tell a computer what's bad and what's good.


My engineer's rigid mentality also tells me that this is probably a task worthy of a (cooperative) Ph.D. or subsidized research, but the world is full of 15-16 year old whiz kids who can apparently churn a top-selling iPhone app out of the blue over a weekend with no effort, so I've learned not to rule anything out in the IT field.


Even without modifying the source code you could modify the .wad graphics to simplify automatic recognition.
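For instance, if the monster sprites were recolored to a single saturated color that appears nowhere else, "seeing" a monster in a captured frame reduces to a pixel scan. A toy sketch, assuming someone else supplies the frame grab as 32-bit RGB (with Doom's native 8-bit output you'd compare palette indices instead, which is even easier):

```c
/* Toy recognizer for a marker color painted into the WAD graphics. */
#include <stdbool.h>

#define MARKER 0x00FF00FFu   /* assumed magenta marker, 0x00RRGGBB */

/* Scans the frame and reports the first marker pixel found. */
bool find_marker(const unsigned int *frame, int w, int h, int *mx, int *my)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            if (frame[y * w + x] == MARKER) {
                *mx = x;   /* screen column: a crude aiming cue */
                *my = y;   /* screen row: a crude distance cue */
                return true;
            }
        }
    }
    return false;
}
```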


Well of course it's not impossible, but Jeopardy and Doom are two entirely different beasts. And even in the case of Watson, it was fed the answers via plain text as they were revealed; there wasn't any kind of perfected speech recognition or OCR going on during it. Granted, it's still impressive that Watson had exactly the same time frame to think about the answers as the human players had. Of course, I've just completely overshadowed the massive technical feat of language comprehension that Watson pulled off; Jeopardy requires a very good level of comprehension and the ability to see through subtle puns.

Doom would require quite extensive analysis and interpretation of the audio and visual output, and I'm sure it would be even more difficult using the original 320x200 display rather than running at 1920x1200 or some other high resolution... This is of course assuming I read the topic right, and that you're not allowed to just write the bot into the engine or otherwise give it a clear understanding of the game state, but instead have to give it exactly the same output that human players use to make their gameplay decisions.


A Watson-like brute-force approach could work. Instead of giving it access to encyclopedia upon encyclopedia of information, you would give it demos of every online match ever played. Instead of finding Jeopardy answers, it would find how to win in any given situation.

How's that?
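As a crude sketch of the idea: boil each recorded moment of a demo down to a small feature vector, then have the bot copy whatever the demo player did in the most similar recorded situation. The features and the database here are invented; a real attempt would need a far richer description of the game state:

```c
/* Nearest-neighbor lookup over a (hypothetical) database of situations
 * extracted from demos. Toy features only. */
#include <limits.h>

typedef struct {
    int health, ammo, enemy_dist;   /* situation features */
    int action;                     /* what the demo player did here */
} sample_t;

/* Returns the action recorded in the demo situation closest to ours. */
int choose_action(const sample_t *db, int n, int health, int ammo, int enemy_dist)
{
    long best = LONG_MAX;
    int action = 0;

    for (int i = 0; i < n; i++) {
        long dh = db[i].health - health;
        long da = db[i].ammo - ammo;
        long dd = db[i].enemy_dist - enemy_dist;
        long d2 = dh * dh + da * da + dd * dd;   /* squared distance */

        if (d2 < best) {
            best = d2;
            action = db[i].action;
        }
    }
    return action;
}
```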


Well yeah, after you got through the daunting task of making sense of pure A/V input and building a world map (and its status) out of it, you'd merely be at the point of a conventional bot :-p

So even if you had a perfect interpretation of the map (just as an in-engine bot would), it would still be a long shot to any kind of worthwhile gameplay, as all the current limitations of bots would still apply (and then some, because the A/V bot would have no way to "cheat" its way out of stupidity by e.g. knowing beforehand where the exit or the pickups are).

If you want the state of the art in this field, search for robotics teams that try to guide Lego Mindstorms contraptions through very simple mazes using brightly colored balls and walls as cues.

Or maybe something like this:

[embedded video]
If there is ANYTHING in the world even remotely more advanced than this, it must surely be classified and developed by the military.

chungy said:

And even in the case of Watson, it was fed the answers via plain text as they were revealed; there wasn't any kind of perfected speech recognition or OCR going on during it.

I thought that was BS. With all the programming and resources that supercomputer had, it would have been trivial to add a webcam and make it play the same game as a human. I really think it should have had an actuator push the button to ring in, too.


Let's be clear that I'm not interested in complicated stuff like that. I'd rather find a way (if the processor or virtual machine lets me) to acquire data from Doom (actors, ammo, coordinates... probably everything under the P_ code). Does statdump.exe, -statcopy or something derivative help?


@printz: nope. Statcopy is simply a bunch of cumulative statistics. If you want 100% precision and direct access to the engine data, you need to run whatever code you want as a "plug-in" of sorts, or modify a port so it can supply this information on demand via an API.
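To give an idea, here's a minimal sketch of such a plug-in, meant to be compiled into a port. The struct fields follow the original Doom source (players[], mobj_t, gametic); the output file and the exact hook point are my assumptions:

```c
/* Engine-side state export: dump live player data once per gametic so an
 * external bot can read it. Compile into a port; field names follow the
 * original Doom source, while the file path is a made-up channel. */
#include <stdio.h>

#include "doomstat.h"   /* players[], consoleplayer, gametic */

void BOT_ExportState(void)
{
    player_t *p = &players[consoleplayer];
    FILE *out;

    if (!p->mo)        /* no map object yet, e.g. between levels */
        return;

    out = fopen("/tmp/doomstate.txt", "w");
    if (!out)
        return;

    /* Fixed-point map coordinates, BAM angle, and basic vitals. */
    fprintf(out, "tic=%d x=%d y=%d angle=%u health=%d bullets=%d\n",
            gametic, p->mo->x, p->mo->y, p->mo->angle,
            p->health, p->ammo[am_clip]);
    fclose(out);
}

/* Call this at the end of G_Ticker() so the data is fresh every tic. */
```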

I think this was covered very extensively in a recent thread, too. The subject was different, but the core problem is exactly the same (and it's only the beginning of the trouble).

Handicapping everything with a fuzzy A/V acquisition layer is better left to someone with a few million dollars of budget, and to people willing to devote 2-3 academic years to it (i.e. base their degrees on it).

qoncept said:

I thought that was BS. With all the programming and resources that supercomputer had, it would have been trivial to add a webcam and make it play the same game as a human. I really think it should have had an actuator push the button to ring in, too.

Nope; see https://en.wikipedia.org/wiki/Watson_%28computer%29#Operation

Speech recognition and even character recognition were beyond the scope of the project. Quite simply, adding either one (or both) would have introduced more complications and chances for error on Watson's part, and wouldn't have made for an effective demonstration or experiment.

Maes said:

I think this was covered very extensively in a recent thread, too. The subject was different, but the core problem is exactly the same (and it's only the beginning of the trouble).

I'm very serious about that. I may not have the time to work on it, or may be distracted by other hobbies, but I find it perfectly possible to get a bot through a -nomonsters map.

printz said:

I'm very serious about that. I may not have the time to work on it, or may be distracted by other hobbies, but I find it perfectly possible to get a bot through a -nomonsters map.


I do too, and furthermore I believe it's possible using only inverse goal analysis (top-to-bottom, e.g. from the exit switch to the door next to it, and so on), plain directed-graph navigation, plus some heuristics and randomized behavior here and there, in order to eventually navigate to a particular spot while avoiding impassable/inescapable traps.
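To make the directed-graph part concrete, a toy sketch: breadth-first search outward from the exit node over a hand-built navigation graph, so that every reachable node learns which neighbor leads toward the exit. The node numbering and the graph itself are illustrative, not derived from any real map:

```c
/* Toy inverse-goal planner: BFS outward from the exit. adj[a][b] != 0
 * means the player can travel from node a to node b (one-way drops make
 * this a directed graph). */
#include <string.h>

#define MAX_NODES 64

int adj[MAX_NODES][MAX_NODES];
int n_nodes;

/* Fills next_hop[] so that from any reachable node the bot knows which
 * neighbor brings it one step closer to exit_node. Returns the distance
 * from 'start', or -1 if the exit can't be reached (a trap, perhaps). */
int plan_route(int start, int exit_node, int next_hop[MAX_NODES])
{
    int queue[MAX_NODES], head = 0, tail = 0;
    int dist[MAX_NODES];
    memset(dist, -1, sizeof dist);

    /* Search from the goal outwards: from the exit switch to the door
     * next to it, and so on. */
    dist[exit_node] = 0;
    queue[tail++] = exit_node;
    while (head < tail) {
        int cur = queue[head++];
        for (int nb = 0; nb < n_nodes; nb++) {
            /* Follow edges in reverse: nb -> cur must be walkable. */
            if (adj[nb][cur] && dist[nb] < 0) {
                dist[nb] = dist[cur] + 1;
                next_hop[nb] = cur;
                queue[tail++] = nb;
            }
        }
    }
    return dist[start];
}
```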

chungy said:

Nope; see https://en.wikipedia.org/wiki/Watson_%28computer%29#Operation

Speech recognition and even character recognition were beyond the scope of the project. Quite simply, adding either one (or both) would have introduced more complications and chances for error on Watson's part, and wouldn't have made for an effective demonstration or experiment.

I was slightly disappointed at first as well, because I thought they had solved it in such a way that it did "the whole loop" (so to speak), including voice and character recognition.

But when I thought about it some more, it was quite honestly impressive enough in the things it did manage to achieve. Plus the rest wouldn't really have added anything to the demonstration. As the saying goes: do one thing and do it well.


Besides, other companies are working on that stuff (speech recognition and computer vision).

Only when AI no longer requires a brain the size of a Ryder truck will adding vision and hearing to it have really useful consequences; that's when you'll see these technologies merge.

IBM's current commercial goal for Watson is to have it answer directly-input questions where it has plenty of time to think and come up with alternatives. It needs neither sight nor hearing to do this :P


Plus, as I recently read somewhere, there's no such thing as "perfect" vision or speech recognition, even for humans: the human eye can be fooled by trompe l'œil, and it's very common in everyday life not to understand someone because he's using a dialectal expression or his accent is slightly off.

It's not even possible to analyze plain text based solely on the literal meaning of words, without contextual knowledge or some way of giving "street smarts" to the AI, when popular expressions, idioms or coded words/slang are used.

If speech recognition were ever used in a mission-critical setting, it would require training the operators to use only unambiguous expressions and a particular bureaucratic tone, not unlike military orders, and "conversations" would have to go one way at a time (signaling the end of a sentence with an "over", as in CW radio or military comms). There's a reason even humans use this form in such settings.

If you forfeit these requirements, you simply end up with meaningless chit-chat and a bunch of jokes about how "computers are so stupid, they can't understand even a simple thing".

chungy said:

Nope; see https://en.wikipedia.org/wiki/Watson_%28computer%29#Operation

Speech recognition and even character recognition were beyond the scope of the project. Quite simply, adding either one (or both) would have introduced more complications and chances for error on Watson's part, and wouldn't have made for an effective demonstration or experiment.

Then it was just a stupid decision. If you're going to showcase a computer that can play the same game as a human, changing the rules of the game is asinine.

As I said, OCR should have been trivial to add, and if it put Watson at enough of a disadvantage that it couldn't compete, then go back to work.

qoncept said:

As I said, OCR should have been trivial to add,


Sure, but would it be any good? What degree of reliability would it need to be "human grade"? 95%? 99%? 99.999%? What if it failed and lost time trying to decipher a question, or misunderstood it?

And would it need to use nothing more than standard TV cameras attached to a robotic head, with focus no different from the human eye's (i.e. no zooming)? Should it be optimized just for Jeopardy's displays, or be more general-purpose to avoid any and all criticism?

And BTW, is Jeopardy playable by on-screen cues alone, or are both hearing and vision required to make sense of it?

As you can see, it would be a whole other set of problems to cope with, which would clearly cross the boundary of the AI they were trying to showcase here and walk straight into the realm of robotics. Two different problems, two different worlds, two different tasks.

That being said, it would have been more awesome if they had presented a human-like robot (even a non-walking square box) that packed all the necessary processing power (including visual tracking and speech recognition) into a human-sized package, with performance at following the game at least as good as a human competitor's. Maybe 10 years from now.

Maes said:

Sure, but would it be any good? What degree of reliability would it need to be "human grade"? 95%? 99%? 99.999%? What if it failed and lost time trying to decipher a question, or misunderstood it?

That's why they play the games. I can't answer that, because they didn't make Watson play by the same rules.

And would it need to use nothing more than standard TV cameras attached to a robotic head, with focus no different from the human eye's (i.e. no zooming)? Should it be optimized just for Jeopardy's displays, or be more general-purpose to avoid any and all criticism?

I think you should be able to wheel the thing in and have it play the game. Really, there shouldn't be any special considerations that require explanation to the audience. Network cable, power, an interface to the supercomputer in the back: fine. But the way Watson received the questions had to be explicitly described.

Feeding text files to Watson would make complete sense if dictation and OCR software had never existed. But they do, so I expect them to go into making a computer that plays on a trivia game show.

qoncept said:

I think you should be able to wheel the thing in and have it play the game.


Which it did. So you could say they achieved their end, even though the means were less than glamorous.

qoncept said:

Feeding text files to Watson would make complete sense if dictation and OCR software had never existed. But they do


If they had a 100% accurate OCR that could work under the same conditions as the human eye, and "dictation software" at least as infallible as a human, that would be an achievement in itself, one that would be touted in technological expos all over the world.

But the current "state of the art" is still mediocre and laughable, and would (pardon the pun) jeopardize the entire enterprise if it were used.

qoncept said:

so I expect them to go into making a computer that plays on a trivia game show.


Which they did. The "challenge" was to make a computer that could play Jeopardy, not a ROBOT that overcomes the still-standing voice recognition and OCR problems.

Perhaps the best way to understand what was at play here is to sit back for a moment and think about how Deep Blue/Deep Thought and the best chess computers in general work: they are simply "thinking boxes" that think of nothing but the game of chess in a pure mathematical sense. They are not built to concern themselves with e.g. visually scanning a chessboard or moving the pieces, nor do they need to in order to excel at what they do. Similarly, Watson doesn't need hearing or OCR in order to play Jeopardy.

It's a classic case of "I want my jetpack": people just expect too much without thinking of all the implications. Watson is not a fucking T-800, nor was it touted as one.


Watson wasn't really about playing Jeopardy at all, although the show was an excellent way to test out what IBM was actually working on: parsing natural human language.

Actually, even implying that Jeopardy was the only purpose it was made for would seriously undermine the project. Think of the robots (as Maes mentioned earlier) that are designed to use cameras and navigate obstacle courses. A lot of them use courses built out of LEGO or similar, since bright colored blocks are easy to distinguish, and programmers tend to get that working before moving on to more complicated and realistic terrain. The LEGO obstacle courses, or Watson playing Jeopardy for that matter, are just a way of getting the technology's feet wet, so to speak, and by no means the end goal.

