
Wad compatibility option database - ideas

I'm not sure that CSV is good for this, as the format contains no information about what the individual parameters mean.

I think a 'key = value' format, like UDMF or Windows INI would be much better suited here. That way ports can skip those keys which they don't understand and the thing would still be editable in a text editor but unlike CSV the meaning of all would be immediately clear.

CSV will inevitably degenerate into a mess because it lacks extensibility.

I'm also not convinced that allowing editing a WAD while still keeping the settings active is a good idea. When I start messing around with a WAD it's not too much to ask to provide compatibility settings of my own. In ZDoom all critical ones can be set from MAPINFO so that a user can custom-configure the maps if he modifies them.


Honestly CSV is barely a standard; it's a term for loosely-related formats that are often fundamentally incompatible with each other. I've had to work with CSV files of various kinds just trying to import into PostgreSQL before, it's really a nightmare as every software author has different ideas of how to do various things in CSV (like escaping characters); "Even a college student still learning Java in CS2132" could make a dead simple parser for one particular flavor, but even the tools that exist out there, that have been worked on for years on end, haven't perfected it. CSV would likely be fine in Doom ports if we all agree on one format for it, but I haven't seen any such discussion here.

Personally I don't know why an INI format was rejected so quickly, it's a simple format, and even though it shares some of the CSV issues (not really being a standard and having a few variations), it's rather obvious to edit in a text editor rather than using a spreadsheet program as the preferred editor.

Edit: My post was written somewhat in reply to Quasar, not Graf. I'm glad to see he shares my concern about using CSV.


The reason I personally reject an ini or UDMF like format is due to how unwieldy they are and how inappropriate they are for representing the type of data we are interested in.

To be honest it doesn't actually matter to me one way or the other what format this ends up in, because if it isn't something scalable, I'll just support it using an offline encoder and treat the verbose format(s) as mere data collection formats.

Though personally I really cannot believe some of you are seriously considering a plain-text flat file "database".

DaniJ said:

The reason I personally reject an ini or UDMF like format is due to how unwieldy they are and how inappropriate they are for representing the type of data we are interested in.



... and you prefer something that will end up in chaos? Yes, that really makes sense. How would you organize such things? ZDoom alone already has more than 30 compatibility options that need to be handled.

ZDoom even has the option to alter linedef properties through its compatibility file - because that's the only way it can handle Strain MAP07.


I really do not mean this to sound aggressive, but ZDoom only has a load of compat options because it changed everything in the first place.

Comparatively, other ports have fewer compat options because they are more faithful.

That said, I would agree with the assertion that whatever format is used should have the ability to support a very large number of compat options; lots of potential spaces for future use.


Indeed, ports may have compat options for reasons other than changes to the original behaviour.

Vermil said:

I really do not mean this to sound aggressive, but ZDoom only has a load of compat options because it changed everything in the first place.

Comparatively, other ports have fewer compat options because they are more faithful.

That said, I would agree with the assertion that whatever format is used should have the ability to support a very large number of compat options; lots of potential spaces for future use.


So? Does that mean that you'd prefer a format that's inadequate to handle the necessary info just because your port of choice doesn't need them all?

Either do it right or don't do it at all. Just looking at it from your little corner and ignoring the needs of other engines will make this thing DOA.

The bottom line is that with ZDoom, PrBoom+ and other engines potentially requiring other options, the whole thing may explode into unmaintainability much quicker than you may think if a cheap-ass format like CSV is used. CSV is not a database format. It's not suited to maintain a database, so it should be forgotten about immediately, I think.

From a storage standpoint something like XML or JSON would probably be best but this will only be one more obstacle for implementors. But so would be a format without any structure at all. Once ports start disagreeing about its content all hope is lost.

So in my opinion a simple format that falls in between these 2 extremes and allows assigning values to properties is the only viable solution.


I fail to see how a few hundred options is a problem for the representation I proposed using a compound hash. Furthermore, small GUI applets can be written to produce the hash. In this representation the CSV is little more than a cached list of monotonic values, like a SQL dump ;)


Ok, time to scream: THE FORMAT DOESN'T MATTER YET! All that matters at this point is how are we going to get the data into a shared location. If the settings were available (in any old format) on a webpage, you could even set them manually in-game if you had to, when you go to play the wad.

Once a good set of data is there, we will absolutely need *some* download format. At that point, a program can turn that output into .ini files, a .csv, a SQL Server database, XML, whatever. It's really not a big deal to convert ANY dataset into your desired format, as long as all the data is there.

My personal choice is the .ini format. Parsing it is child's play, and it can have all manner of additional, useful data. It natively supports a sort of one-to-many relationship, which is my big argument against CSV. It could be useful for adding non-essential info, such as Wad Name, Author, etc. It could even store multiple wads if it had to:

[WAD]
Name=My New Wad
Author=kb1
file=mynewwad.wad
hash=40475937457394579345789
boomspecials=1

[MAP01]
archvileghosts=1

[WAD]
Name=Some Random Level
file=idgames\levels\doom2\s_t\srand.wad
file=srand_1.wad
hash=34859823458093480
hash=37457394579345789

[E1M1]
doom_stairs=1
In the above example, the file identifies 2 wads. The first wad has some additional info that could be displayed in a launcher, or in game. It also has a global setting "boomspecials" that applies to all maps in the wad.
The second wad has 2 file names, one of which contains the original idgames path. Both names and/or hashes would produce a positive match.
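
As a rough illustration of how little code the layout above needs, here is a minimal Python sketch of a reader. The function names (parse_compat_ini, find_wad) and the dictionary layout are assumptions made for this example, not part of any port. Python's standard configparser would reject or merge the repeated [WAD] sections, so the sketch parses by hand: every [WAD] header starts a new entry, repeated file= and hash= keys are collected into lists, and any other section is treated as a map belonging to the current wad.

```python
SAMPLE = r"""
[WAD]
Name=My New Wad
Author=kb1
file=mynewwad.wad
hash=40475937457394579345789
boomspecials=1

[MAP01]
archvileghosts=1

[WAD]
Name=Some Random Level
file=idgames\levels\doom2\s_t\srand.wad
file=srand_1.wad
hash=34859823458093480
hash=37457394579345789

[E1M1]
doom_stairs=1
"""


def parse_compat_ini(text):
    """Parse the INI-like layout into a list of wad entries (hypothetical structure)."""
    wads = []
    current_wad = None  # entry opened by the most recent [WAD] header
    target = None       # map dict receiving key=value lines, or None for wad level
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and ;-style comments
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1].strip().upper()
            if name == "WAD":
                current_wad = {"info": {}, "files": [], "hashes": [],
                               "options": {}, "maps": {}}
                wads.append(current_wad)
                target = None
            elif current_wad is not None:
                target = current_wad["maps"].setdefault(name, {})
            continue
        if "=" not in line or current_wad is None:
            continue  # unknown line: skip it instead of failing
        key, value = (part.strip() for part in line.split("=", 1))
        key = key.lower()
        if target is not None:
            target[key] = value                   # per-map option
        elif key == "file":
            current_wad["files"].append(value)    # repeated keys accumulate
        elif key == "hash":
            current_wad["hashes"].append(value)
        elif key in ("name", "author"):
            current_wad["info"][key] = value      # descriptive, non-essential info
        else:
            current_wad["options"][key] = value   # wad-wide option


    return wads


def find_wad(wads, file_name=None, file_hash=None):
    """Return the first entry matching either a file name or a hash."""
    for wad in wads:
        if file_name in wad["files"] or file_hash in wad["hashes"]:
            return wad
    return None
```

With the sample above, find_wad(parse_compat_ini(SAMPLE), file_name="srand_1.wad") returns the second entry, and so does a lookup by either of its hashes.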

*However*
This is just one possible format, a format that I personally find helpful. Honestly, I don't know how hash fields, or CSV would encapsulate that much flexibility. But I would want whatever format we use to at least have that much flexibility.

Yes, this is difficult to get into a database field (but not a file). I would be willing to mock something up, as others have also offered, but, like them, I don't know how to go about hosting the site/database. Of course, the .ini file approach avoids all that, as it amounts to uploading a text file.


How can you say the format doesn't matter yet when you concede you have no place to host your ini flat file database?

I have access to a server and a cross platform desktop application with a plugin architecture which allows me to use HTTP to communicate directly with said server, and thus the SQL database that is hosted there. I could set this all up in no more than an hour or two.

In layman's terms, I could implement a Doomsday plugin that presents a graphical user interface for the user to change the map metadata that is automatically communicated back to the server.

In said plugin I could implement the decoder/encoder for any hypothetical verbose format (e.g., ini) and use that instead, but it's not really necessary at this stage. Especially if we are encoding to a hash, because I can implement that in an HTML form as we've already discussed.

So from my point of view: THE COLLECTION FORMAT DOESN'T MATTER AT ALL!


Graf Zahl said:
From a storage standpoint something like XML or JSON would probably be best but this will only be one more obstacle for implementors. But so would be a format without any structure at all. Once ports start disagreeing about its content all hope is lost.

^

If you want to use a database that is hosted on a server, I would really look into NoSQL for this sort of thing. A SQL database is way too much overhead for something as simple as this.


Now, I'm trying to figure out whether you should have settings for each map in the WAD. This would allow per-map behaviour, but it would also give us inconsistencies.

A quick and bad example would be CChest.wad: map29 requires that wall running is enabled in order to complete the map. If that flag is only enabled for that map, you can say there is an inconsistency in movement between map29 and all the other maps, so you would enable the flag across the entire WAD instead. I'm not saying this is the way to go, but it is something to think about. Personally, I would rather have a per-map setting.


No offense but that sounds like MongoDB fanboyism.

Firebird is an extremely efficient database; it is holding 10 years of medical data at my job in less than 3 GB of disk space. It's also very fast.

Not that I am advocating its use. I just think all this NoSQL stuff is a fad started by a bunch of people who were upset because they skipped out on taking a file data structures class in college and don't understand why things like B-trees, indexes, and locks are necessary for scalability, performance, and data integrity.

Quasar said:

Firebird is an extremely efficient database; it is holding 10 years of medical data at my job in less than 3 GB of disk space. It's also very fast.

I have used Firebird for all my projects for the last 10 years :)


I haven't used a NoSQL database before. At work I don't even think we have used it, but it does come up every now and then.

NoSQL does have real uses, because it is non-relational. If you have data that isn't related, then it works extremely well. If you do have relational data, yeah, SQL all the way. :D

I haven't heard of Firebird, I'll have to check it out.


I agree with Graf that it should be XML or JSON, this isn't 1988 where everyone uses INIs for everything. The point of a compatibility database is for a linear data format, using would be a huge pain in the ass.

This:

[WAD]
Name=My New Wad
Author=kb1
file=mynewwad.wad
hash=40475937457394579345789
boomspecials=1

[MAP01]
archvileghosts=1

[WAD]
Name=Some Random Level
file=idgames\levels\doom2\s_t\srand.wad
file=srand_1.wad
hash=34859823458093480
hash=37457394579345789

[E1M1]
doom_stairs=1
Could be turned into this
<?xml version="1.0" encoding="UTF-8" ?>
<compatibilitydb>
   <wad file="somethin.wad" href="idgames/somewhere/in/id/games/somethin.zip" md5="59c0d428ee8a03e1cfadd7ca5f3bb421">
      <level lump="MAP01">
          <option name="broken-z-height">true</option>
          <option name="broken-bfg">true</option>
      </level>
   </wad>
</compatibilitydb>
Of course, if the compatibility data is stored within its own WAD, then a self reference could be used.

If CSV is used, something like this could be done
somethin.wad,map01,broken-z-height,1,broken-bfg,1
somethin.wad,map02,line-123-is-broken-door,1
But there's a problem with text formats if there is no information to speed up access of the data. It will take forever to look for the WAD especially if the database is gigantic. Using separate files can speed this up but nobody likes having 1000 small files lying around.
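
One way to address the lookup cost for the CSV rows above is to read the file once and index it in memory. The following Python sketch assumes the row layout from the two sample lines (wad name, map name, then alternating option name and value cells); the function name load_compat_csv and the choice of a (wad, map) tuple as the key are assumptions for illustration only.

```python
import csv
import io

SAMPLE_CSV = """\
somethin.wad,map01,broken-z-height,1,broken-bfg,1
somethin.wad,map02,line-123-is-broken-door,1
"""


def load_compat_csv(text):
    """Index rows by (wad, map) so that each later lookup is one dict access."""
    index = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        wad, level, *pairs = row
        options = index.setdefault((wad.lower(), level.upper()), {})
        # The remaining cells alternate: option name, option value.
        for name, value in zip(pairs[0::2], pairs[1::2]):
            options[name] = (value == "1")
    return index


index = load_compat_csv(SAMPLE_CSV)
map01_options = index.get(("somethin.wad", "MAP01"), {})
```

After the one-time load, finding a map's options does not depend on how many rows the file has. The variable-length rows also show the weakness raised earlier in the thread: nothing in the file itself says what each cell means.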
Keeping a database online is bad because people with internet access will eventually get stuck in the level.


GhostlyDeath said:

If CSV is used, something like this could be done

somethin.wad,map01,broken-z-height,1,broken-bfg,1
somethin.wad,map02,line-123-is-broken-door,1


CSV is definitely not a format you want for storing and accessing data for what we are trying to do here. As far as I know, CSV is commonly used to hold query results and to be opened in Excel.

GhostlyDeath said:
But there's a problem with text formats if there is no information to speed up access of the data. It will take forever to look for the WAD especially if the database is gigantic. Using separate files can speed this up but nobody likes having 1000 small files lying around.

While keeping separate files may speed things up, it will not be manageable. If you have 100 WADs, then that means you need 100 of these files. A SQL database could handle what we are doing, but IMO a NoSQL DB could be a better solution. http://ravendb.net/ RavenDB - "Raven stores schema-less JSON documents, allow you to define indexes using Linq queries and focus on low latency and high performance." Again, I haven't used a NoSQL DB, but I know what they are used for, and I do think it would work great in this situation. This doesn't mean SQL would be bad.

GhostlyDeath said:
Keeping a database online is bad because people with internet access will eventually get stuck in the level.

Well, it doesn't stop the player from using cheats or changing the settings on the fly. Unless it's online, then we have a problem. I don't know if we are worried about sourceports using this at the moment; this could be implemented in a launcher instead. Also, having a database online isn't a bad idea. However, my only argument not to use an actual online database is what if a user wants to add his own WAD settings? Well, he isn't the DBA so he can't really do anything about it can he?

GhostlyDeath said:

I agree with Graf that it should be XML or JSON, this isn't 1988 where everyone uses INIs for everything. The point of a compatibility database is for a linear data format, using would be a huge pain in the ass.

Please do explain to me how either of those would be easier to use from the sourceport's point of view (or indeed the database's) than the proposed compound hash representation. The only place such formats might be useful is during the data gathering phase and even then they don't make sense because you are asking a user to manually edit something that a tool is much better suited for.

Edit:
I will state that I have nothing against JSON or XML. In fact I use both regularly in my day job and have recently just implemented an XML output mode for Doomsday's Master Server. So don't think I'm discounting these formats for no good reason.

The use of a particular format has to offer intrinsic worth to the implementing application. JSON/XML do not offer any tangible benefit in this case.

GhostlyDeath said:

I agree with Graf that it should be XML or JSON, this isn't 1988 where everyone uses INIs for everything. The point of a compatibility database is for a linear data format, using would be a huge pain in the ass.
This:

...(rather elegant, easy to parse and read code)...


Could be turned into this

...(nasty looking nested .ini file turned sideways, which would require a library to read, or some complex code, is hard to write manually and get right, and is just damn ugly)...IMHO.

GhostlyDeath said:
But there's a problem with text formats if there is no information to speed up access of the data. It will take forever to look for the WAD especially if the database is gigantic.

You can't be serious, man. I can parse somewhere around 1 million text lines per second on my quad core without trying too hard. That's 10,000 wads at 100 lines each. Try that with XML.

GhostlyDeath said:
Using separate files can speed this up but nobody likes having 1000 small files lying around.

Agreed, and, so far, it's the only real valid argument against the approach. We're just throwing ideas around here.

My point is that we need to be able to get an online shared dataset built. From that point, we can build hashes, CSVs, XMLs, INIs, whatever.

DaniJ said:

How can you say the format doesn't matter yet when you concede you have no place to host your ini flat file database?

I should have said the *output* format doesn't matter as long as it contains all the data. Yes, it has to be in *some* format, regardless of whether or not I have the ability or know-how to host edits to said format.

DaniJ said:

I have access to a server and a cross platform desktop application with a plugin architecture which allows me to use HTTP to communicate directly with said server, and thus the SQL database that is hosted there. I could set this all up in no more than an hour or two.

Awesome. I wasn't trying to get you or anyone else to do all the work (of course, the bulk of the work is in entering the data by the community)

DaniJ said:

...Especially if we are encoding to a hash because I can implement that in an HTML form as we've already discussed...

I can forego the additional text fields for a large set of bits that control options - sounds reasonable. But: Please leave the ability to expand the list of bit definitions - we will invariably need to expand the bit count in the future. That should be spec'd at the beginning. Oh, and 2 bits per option:

00 = off
01 = don't care
10 = special
11 = on
I think a 'Don't care' is important - there's no need to force all settings on or off. And, if you're going to use 2 bits, you could define "special" to mean "look into some other field for the correct value", to handle cases where boolean is insufficient.
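
To make the 2-bits-per-option idea concrete, here is a small Python sketch that packs and unpacks such states using the four codes listed above. The option order in OPTION_NAMES and the helper names are hypothetical; a real scheme would have to fix the slot order in its spec.

```python
from enum import IntEnum


class OptionState(IntEnum):
    """The four 2-bit codes proposed above."""
    OFF = 0b00
    DONT_CARE = 0b01
    SPECIAL = 0b10   # "look into some other field for the correct value"
    ON = 0b11


# Hypothetical slot order. New options would only ever be appended,
# so existing slots keep their bit positions.
OPTION_NAMES = ["boomspecials", "archvileghosts", "doom_stairs"]


def pack_states(states):
    """Pack a {name: OptionState} mapping into one integer, 2 bits per slot."""
    packed = 0
    for slot, name in enumerate(OPTION_NAMES):
        state = states.get(name, OptionState.DONT_CARE)  # unspecified = don't care
        packed |= int(state) << (2 * slot)
    return packed


def unpack_states(packed):
    """Recover the state of every known option from the packed integer."""
    return {
        name: OptionState((packed >> (2 * slot)) & 0b11)
        for slot, name in enumerate(OPTION_NAMES)
    }


packed = pack_states({
    "boomspecials": OptionState.ON,
    "doom_stairs": OptionState.OFF,
})
states = unpack_states(packed)
```

One consequence of this particular code assignment is worth spelling out: a value packed before a new option was appended decodes that new slot as 00, which means "off" rather than "don't care". That is the kind of detail that would need to be spec'd at the beginning.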

kb1 said:

I can forego the additional text fields for a large set of bits that control options - sounds reasonable. But: Please leave the ability to expand the list of bit definitions - we will invariably need to expand the bit count in the future. That should be spec'd at the beginning.

Most definitely. I'm not trying to suggest that this format will be set in stone, and we absolutely need to factor in a mechanism to expand the hash when new properties become needed.

Oh, and 2 bits per option [i.e., multichoice preferences]

This is not something that the format should even consider. By their very nature compatibility options are booleans: a map either needs option X or it doesn't. Whether or not an application or an end user should care is beyond the scope of the problem we are trying to solve (at least as far as this metadata definition is concerned).

This is an important distinction to make because the definitions of "don't care", "always override" etc, are port variant.

Such information would be better managed by each implementing application, using a set of preference options associated with the "concrete" hash.


Is there anything I can contribute to this discussion? Chocolate Doom only does Vanilla WADs, after all.

Only thing I can think of is the spechit magic numbers.

fraggle said:

...Only thing I can think of is the spechit magic numbers.

I don't think the spechit numbers really fit. They define what should happen when a spechit overflow occurs on a map, but doesn't that value change based on which engine created a demo? Although, I suppose they could apply strictly to demo lumps *within* a wad. In that case, it may make some sense to include, I think.

Edit: Maybe an option could represent that a spechits overflow can occur (which would certainly break a map on a non-limit-removing port).

DaniJ said: (regarding "don't care" and "special")

This is not something that the format should even consider...

I'm all for keeping it simple. Just having the booleans would indeed be a great advantage, and, if expandability is considered properly, we can always add new functionality as needed.

Going all boolean, I would suggest that a boolean "No" should represent vanilla behavior, so an option set of, say "00000000" = vanilla compatibility, just for sanity's sake.
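
The all-boolean convention can be sketched in a few lines of Python. The option names and bit positions below are hypothetical placeholders, not an agreed list; the point is only that an empty option set encodes to all zeros, which then reads as "vanilla compatibility".

```python
# Hypothetical option names and bit positions, for illustration only.
OPTION_BITS = {
    "boomspecials": 0,
    "jump_required": 1,
    "crouch_required": 2,
}


def encode_options(enabled):
    """Set one bit per enabled option; no options at all gives 0 (vanilla)."""
    value = 0
    for name in enabled:
        value |= 1 << OPTION_BITS[name]
    return value


def decode_options(value):
    """Return the names of all known options whose bit is set."""
    return {name for name, bit in OPTION_BITS.items() if value & (1 << bit)}


def to_bit_string(value, width=8):
    """Render the value as a fixed-width string of 0s and 1s."""
    return format(value, "0{}b".format(width))


vanilla = encode_options([])
needs_boom_and_crouch = encode_options(["boomspecials", "crouch_required"])
```

Because decode_options only tests bits it knows about, a port built against an older option list would simply ignore bits added later.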

Now, we should assemble a list of options, in the spirit of attempting to keep everything boolean (i.e. "Yes" or "No" answers to each option). I would ask that port authors contribute if you can.

Once again, we are not trying to reinvent a "universal MAPINFO" (although that would be a nice discussion at a further date). Rather, this would be a set of Yes/No options that, if not set properly, would cause a particular map to malfunction. This difference is subtle, and difficult to define exactly. I suppose each option should be voted upon.

Here's how I define "malfunction":

A. A map that is physically unfinishable without clipping.
B. A map that doesn't provide the proper ammo, pickups, due to, say differing voodoo doll functionality.
C. A map with doors/plats that don't open as intended (areas are unreachable due to options)
D. Maps that require/forbid jump/crouch/autoaim/lookup/down
E. Sky rendering errors caused by options
F. Maps with custom thing types/line types/sector types/etc.
G. Maps requiring a custom texture/sound/etc wad. <- Suggests more than boolean
H. Maps requiring DEHACKED lumps <- Suggests more than boolean

Yes, I added G and H, knowing they wouldn't fit into boolean. These things need to be discussed. It's easy to turn this into a MAPINFO format (Level Names, Par Times, Skies, bla bla), but please resist the temptation. One could argue that maybe Option E is a MAPINFO thing, since it doesn't "break" a wad. But G and H would in fact break a game, so they should be included.

I could use some help defining these requirements. We should use this thread to iron out the "rules" as well as the list of options, and, maybe when we get to a point, we can create a new thread that simply lists the results of our discussions.


FWIW, almost any kind of database idea (SQL or NoSQL) can pretty much be thrown out of the window for being overcomplicated (Hell, I clicked the Raven DB link, and it already can be disqualified as soon as it said "for the .NET/Windows platform"). The only one that might possibly be a consideration would be SQLite since it requires no server, no license hassles (public domain!), and can easily be incorporated into a source port's code base (no extra dependencies). Still, maintaining a SQLite database for WAD compatibility is overkill and not everyone wishing to add to the database will know how to do so.

About XML: I agree that it's overkill and overcomplicated for the purposes of this proposal (GhostlyDeath's example, btw, was not valid XML; anyone familiar with XML will know why).

About JSON: on the other hand, it could actually work rather well with this proposal. Its libraries are simple to use, and it's not terribly difficult to understand by eye. I imagine something like:

{
    "some.wad": {
        "doom-stair-building": true,
        "md5": "cfcd208495d565ef66e7dff9f98764da"
    },
    "other.wad": {
        "md5": "c4ca4238a0b923820dcc509a6f75849b",
        "deathmatch-starts": false
    }
}
(I've actually purposefully made this seem somewhat odd; the order of entries in JSON does not matter, so no matter where someone decided to add an "md5" or any other field, a proper JSON parser won't care and will always retrieve the correct value)
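
A short Python sketch shows the point about key order using only the standard json module. The helper name load_by_md5 and the idea of re-indexing the entries by their "md5" value are assumptions for this example, not something proposed in the thread.

```python
import json

SAMPLE_JSON = """
{
    "some.wad": {
        "doom-stair-building": true,
        "md5": "cfcd208495d565ef66e7dff9f98764da"
    },
    "other.wad": {
        "md5": "c4ca4238a0b923820dcc509a6f75849b",
        "deathmatch-starts": false
    }
}
"""


def load_by_md5(text):
    """Re-index the entries by checksum so a wad can be matched by content."""
    database = json.loads(text)
    by_md5 = {}
    for file_name, entry in database.items():
        # Everything except the checksum is treated as a compat option.
        options = {key: value for key, value in entry.items() if key != "md5"}
        by_md5[entry["md5"]] = {"file": file_name, "options": options}
    return by_md5


by_md5 = load_by_md5(SAMPLE_JSON)
match = by_md5.get("c4ca4238a0b923820dcc509a6f75849b")
```

The "md5" key is found in both entries even though it comes last in one and first in the other, and the true/false literals arrive as real booleans rather than strings.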

kb1 said:

Going all boolean, I would suggest that a boolean "No" should represent vanilla behavior, so an option set of, say "00000000" = vanilla compatibility, just for sanity's sake.

Yes, this would be a very handy property for the hash. If implementers have to inverse a few logic tests to accommodate this - so be it.

kb1 said:

Here's how I define "malfunction"...

I don't think we really need to be defining a qualitative markup language for sorting these, not yet at least. All we actually need at this stage is a list of options from each port(/author) interested in joining the scheme, with an abbreviated summary of what each actually does in their own implementation affecting vanilla behaviour. Compat options which do not define behaviour but rather some trivial render-time detail, like whether the ouch-face fix is applied, need not be included.

I'll get the ball rolling as Doomsday has few compat options...

Doomsday:
Game | Option
d | h | x | Doc | Name | Implementation
x | - | - | Tag 666 | any_boss_trigger_666 | common fix
x | - | - | Ghost Monsters | archvile_noresurrect_ghosts | on= PIT_VileCheck respects height+radius+solid, A_VileChase applies original height+radius
x | - | - | Lost soul limit | painelemental_limit_lostsouls | map-global counter test in A_PainShootSkull
x | - | - | Lost soul clipping problems | lostsouls_stuck | A_PainShootSkull uses P_CheckSides and within floor...ceiling tests to reject
x | x | - | Monsters stuck in doors | monsters_stuck_in_doors | applies Boom fix.
x | x | - | n/a | never_hang_over_ledges | applies Boom fix (i.e., avoidDropOff in newChaseDir and dropoff height limit in P_TryMove)
x | x | - | n/a | fall_under_own_weight | applies Boom's torque changes in P_MobjThinker
x | x | - | Non-newtonian corpses | corpse_stair_slide | applies Boom's changes to P_ApplyTorque and mobj X|Y movement
x | x | - | n/a | vanilla_exact_clipping | large negative-offset fix in mobj X|Y movement
x | x | - | n/a | vanilla_exact_clipping_ifnot_wallrun | exception to above for wall running, obtaining vanilla's only-north walls behaviour
x | - | - | n/a | zombie_exits | prevent players with health <= 0 from triggering specials 11 and 51

chungy said:

No to databases!!

Sourceports and other implementers would not need direct access to the central database. They could use an offline copy in CSV (or other) format, which can be shipped along with the app and updated by the user manually if necessary (download and replace).


The only reason to use some type of DBMS is for the scalability.

JSON would totally work fine if we do not want an actual database and it is stored locally. The only question would be performance later on, but I don't know what the performance difference would look like for, say, 10,000 objects in JSON compared to 10,000 objects in a NoSQL store. Btw, NoSQL is meant to be simpler than SQL. And in case you are wondering, http://www.mongodb.org/ .

But anywho, good luck with the task though! :)


It would basically be up to individual implementers what local representation they decide to use. If their preference happens to be JSON then that's just fine.

However, if we do go with a hashed form in the central database and implementing ports, it makes zero sense to transform this into a more verbose representation just for an offline copy of the database. A CSV file of the hashes would be more than sufficient for this purpose.

Compat options should be per-map as collaboratively authored mods may contain maps which were singularly tested in a wide variety of ports. There is no logic which applies one ruleset across an entire wad other than that implicated by the port used to test an individual map by the original author.
