Idea: METADATA lump for Slade and Doom Builder

Linguica · May 21, 2015

kb1 said:
I feel they have been answered, and the issues solved in a way that would provide a working, reasonable solution. I think we're there.

Let's not go overboard here.

As I see it, there are still a couple of major questions we need to nail down, including:

* Is metadata intended to be stored in a single file / lump, or multiple files / lumps? Or should the solution handle both in different cases, e.g., have one metadata file per directory for ZIPs/PK3s (like DaniJ suggested), and concatenate them all into a single metadata lump for a WAD?

* Related to the previous question: how is metadata reliably related to the asset it is describing? If assets have identical names, what is the method that the metadata uses to distinguish them?

* If we want to use hashes or CRCs to help identify assets, that will make it very difficult for people to use non-metadata-supporting tools to manually include their own metadata. Is this acceptable?

* How much duplication of metadata are we willing to have? On one end, we could list all assets, and for each, list all the metadata for that asset, even if that means repeating the author field over and over. On the other end, we could, say, list all unique metadata fields, and then for each, list which assets include that field.

kb1 · May 21, 2015

I must answer as if I were designing the spec, and writing the editor source code modifications to make it all work.

Linguica said:
Let's not go overboard here.

As I see it, there are still a couple of major questions we need to nail down, including:

* Is metadata intended to be stored in a single file / lump, or multiple files / lumps? Or should the solution handle both in different cases, e.g., have one metadata file per directory for ZIPs/PK3s (like DaniJ suggested), and concatenate them all into a single metadata lump for a WAD?

Metadata would be stored in a single lump per WAD. My spec assumes that, once an editor loads an alternate type of container file (ZIPs, etc.), the editor will treat the loaded data as if it were a WAD file, assuring compatibility. WADs don't have folders. WADs do have a type of single-level sub-folder, implemented using the marker lumps. However, that is pretty much ignored by my implementation. For example, I no longer feel that map lumps should be treated any differently than other lumps - it's not necessary.

With that being said, editors should be able to handle the case where multiple metadata lumps exist, which can occur when using a non-compliant editor while merging WADs. (A compliant editor would know which WAD the metadata came from, and keep things in sync) The editor can parse them in position-in-WAD order, and, upon save, concatenate the data into one metadata lump.

Linguica said:
* Related to the previous question: how is metadata reliably related to the asset it is describing? If assets have identical names, what is the method that the metadata uses to distinguish them?

It is the supporting editor's job to write one metadata entry for each lump present, in lump order. The metadata entries will be in the same order as the lumps themselves. This works 100% if supported editors are used.

If an unsupported editor is used, identically-named lumps with identical contents could get mismatched. That cannot be avoided, but it can be handled a few different ways. We have not decided which one is best, but all editors should handle it the same way:

1. Continue to trust lump order/metadata entry order.
2. Drop the metadata for both, when 2 identically named, identical lumps exist, and metadata has been proven to otherwise not match lump data.
3. Prompt the user.

If it were me, I'd choose #1, because I believe it would cost the user the least headache, and it would most likely reflect the proper state of the data in an overwhelming number of cases. Note that, as soon as a supporting editor saves the WAD once, there will be a tight correspondence between the lump directory and the metadata. From that point on, you'd have to really be jumbling things up to get mismatches. One example is when you rearrange maps. But, if you're matching on both name and CRC, this is rarely an issue anyway.

Linguica said:
* If we want to use hashes or CRCs to help identify assets, that will make it very difficult for people to use non-metadata-supporting tools to manually include their own metadata. Is this acceptable?

Important distinction: Hashes don't identify assets, they only verify that a metadata entry matches the lump it claims to match. It's not a full-WAD CRC lookup. For a given metadata, you find each matching name, and then see if the CRC matches. Or the other way around (for each lumpname, find matching meta entry name, and check CRC). Of course, you have to be careful to preserve order for duplicate names (discussed elsewhere).

To clarify, I do not think humans should be directly editing the metadata lump. In my proposal, metadata is not a credit system, per se. If you want to include metadata, you have to use an editor that handles metadata, and provides text entry fields and dialogs to allow manual property editing (like Author). The whole "CRC verify for non-metadata editors" is a convenience only, designed to attempt to maintain the metadata created by a supporting editor, in the unusual case that someone opened up the wad in MyOldEditor, imported a few lumps, and saved it.

Linguica said:
* How much duplication of metadata are we willing to have? On one end, we could list all assets, and for each, list all the metadata for that asset, even if that means repeating the author field over and over. On the other end, we could, say, list all unique metadata fields, and then for each, list which assets include that field.

We're talking about a highly compressible 1 MB for a 10,000-lump wad with 100 characters of metadata per lump (10,000 x 100 bytes).
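
As a rough check on that estimate, here is a minimal Python sketch. The entry layout and field values are invented for illustration; only the 10,000-lump and 100-characters-per-entry figures come from the post above, and the deflated size will depend heavily on how repetitive real metadata is.

```python
import zlib

def build_metadata_text(lump_count=10_000):
    """Build a synthetic METADATA text with exactly 100 characters per lump."""
    entries = []
    for i in range(lump_count):
        # Hypothetical entry layout, padded to 99 characters plus a newline
        entry = (f'name="LUMP{i:04d}"; crc32={i * 2654435761 % 2**32:08x}; '
                 f'author="kb1"; modified="2015-05-21 12:00:00";')
        entries.append(entry.ljust(99)[:99] + "\n")
    return "".join(entries)

raw = build_metadata_text().encode("ascii")
packed = zlib.compress(raw, 9)  # deflate, the same family ZIP/PK3 uses
print(f"raw: {len(raw)} bytes, deflated: {len(packed)} bytes")
```

Repeated values such as the author field are exactly what deflate handles well, so the duplication mostly costs space in uncompressed WADs.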

--------------

Again, my implementation would require that you use a supporting (compliant) editor if you wanted to maintain metadata. And, due to careful generation of the hash within that metadata, a WAD can survive the occasional edit from a non-compliant tool. This non-compliant tool would have no knowledge of metadata, so it would not track edits properly.

However, these untracked edits are immediately detected as soon as the WAD is re-opened in the compliant editor. This modern editor would be able to detect which lumps had not been tracked properly. When an untracked edit is detected, this modern editor has no choice but to delete the now-stale metadata. But, all of the other metadata stays intact. That's the beauty of including the hash.

We could drop that functionality, and just store a user-editable "CREDITS" lump, and the editor would not help maintain things like date/timestamps. But, we can do that anyway - just create a text file, and import it. Personally, I want my editor to do the tracking, and I want that data stored inside the WAD file being tracked, and I want to be assured that the computer will guarantee me that this metadata is either 100% correct, or was dropped due to a non-compliant edit.

By having this "insurance" in place, I am now comfortable enough to try to track some additional stuff, like Author, long lump name (for editor display purposes only), and other things. But, I go into it knowing full well that the metadata will only persist 100% if I use tools with proper support for it.

As you can see, the purpose and usage of metadata is narrowly defined by my spec. I state up front that you cannot depend on its existence, cause old tools do not preserve it well. But, I also claim that the format tries to survive such tools, which is powerful.

Here's a point we should discuss: It is possible to preserve some metadata even after a non-compliant edit:

Let's say I have a WAD with maps and some custom flats. This WAD has full metadata, cause it was built with a modern editor. But then, I open up the WAD in an old editor, and I touch up a flat, named "STEEL1". This changes the lump's CRC.

Now, I open up the WAD in the modern compliant editor again. The editor matches up all the metadata, except for this one flat. The editor knows that metadata exists for "STEEL1". The editor also knows that a lump named "STEEL1" exists. Finally, the editor knows that the current CRC does not match the metadata.

At this point, the editor might pop up a dialog:

----------------------------
"STEEL1", 5/18/2015 by kb1 has been modified in an external editor. Preserve metadata?

[X] Author
[X] Description
[X] Long name
----------------------------

At this point, the user can choose which fields to preserve. That would be a nice feature.
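
The detection step behind that dialog could look roughly like the Python sketch below. The dict keys (`name`, `crc32`, `author`, `description`) and both function names are assumptions made for illustration, not part of any agreed spec, and the sketch ignores the ordering rules for duplicate names.

```python
import zlib

def classify_lump(name, data, entries):
    """Return ("valid" | "stale" | "missing", entry_or_None) for one lump.

    `entries` is a list of dicts with the hypothetical keys "name" and "crc32".
    """
    crc = zlib.crc32(data) & 0xFFFFFFFF
    stale = None
    for entry in entries:
        if entry["name"] != name:
            continue
        if entry["crc32"] == crc:
            return "valid", entry
        if stale is None:
            stale = entry  # same name, different contents
    if stale is not None:
        return "stale", stale
    return "missing", None

def preserve_fields(entry, data, keep):
    """Rebuild the entry for an externally modified lump, keeping only the
    fields the user ticked in the dialog."""
    fresh = {"name": entry["name"], "crc32": zlib.crc32(data) & 0xFFFFFFFF}
    for field_name in keep:
        if field_name in entry:
            fresh[field_name] = entry[field_name]
    return fresh

old, new = b"original flat", b"touched-up flat"
entries = [{"name": "STEEL1", "crc32": zlib.crc32(old) & 0xFFFFFFFF,
            "author": "kb1", "description": "steel floor"}]
status, entry = classify_lump("STEEL1", new, entries)
kept = preserve_fields(entry, new, ["author"])
```

Here `status` comes back as "stale", which is the case where the editor would prompt; `kept` carries the author forward with a refreshed CRC and drops the description.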

I hope I've successfully conveyed my thoughts on a simple, workable solution, for a very narrow, yet useful feature set. I believe it squeezes in a nice handful of functionality without getting carried away. And, it offers custom properties which can be used sparingly, or dangerously.

Can it be abused? Of course. Will it be? Probably not so much. I have a feeling that people will appreciate the extra data, and even come to expect tools to maintain it in the future.

Ok, I'm really done :) I have a lot of RL to do :) Let's do a prototype, and see how it behaves.

Graf Zahl · May 21, 2015

After reading through all of this, my only remark can be:

This is a perfect lesson of how a subject can be discussed to death.

And I haven't seen any substantial input from editor developers here so it's quite clear that this will lead to nowhere.

Have fun, but be aware that you have to gain the support of the only people who could implement it for you!

Gez · May 21, 2015

Linguica said:
* If we want to use hashes or CRCs to help identify assets, that will make it very difficult for people to use non-metadata-supporting tools to manually include their own metadata. Is this acceptable?

http://www.softpedia.com/get/System/File-Management/HashMyFiles.shtml

Dragging and dropping a file on that program takes two seconds and yields the result.

DaniJ · May 21, 2015

Not to mention - how does this regulate the use of METADATA in a .zip file? Are we seriously considering a specialist tool for editing those?

Edit: Damn it! I said I wouldn't comment on this anymore... last time, promise. Ignore me, I'm out.

sirjuddington · May 21, 2015

Heh, well there certainly has been a lot of discussion in this thread and not a lot (relatively) to show for it, especially for what was initially such a simple suggestion. I agree with most of what kb1 has been suggesting in his last few posts, though, and for the most part that's how I would go about it myself if it was only for SLADE.

From what I can tell all that is really needed in the case of a wad file is a text lump named METADATA that contains extra info about each lump in the wad.

Personally I'd go with a UDMF-like syntax as both editors that could implement support already have parsers for UDMF.

So taking, say, PLAYPAL in doom2.wad, we'd have:

metadata
{
    name = "PLAYPAL";
    size = 10752;
    crc32 = a18cc8da;
    created_date = "1994-03-30 11:34:04"; // Example dates, not necessarily accurate :P
    modified_date = "1994-05-04 16:55:42";
    author = "id Software";
    source = "doom2.wad";
}

The METADATA lump would contain one of these blocks for each lump in the wad, and each metadata block can contain any arbitrary values the editor requires, in addition to the standard values above. Values unrecognised by the editor can just be ignored as long as they are retained. The only mandatory values would be 'name', 'size' and 'crc32'.
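
For anyone who wants to experiment before an editor supports this, a flat block like the one above can be read with a few regular expressions. This is only a sketch in Python, not how SLADE or Doom Builder parse UDMF: it assumes no nested blocks and no braces inside strings, and it returns every value as a string so the caller decides how to interpret `size` or `crc32`.

```python
import re

STRING_OR_COMMENT = re.compile(r'"(?:[^"\\]|\\.)*"|//[^\n]*')
BLOCK = re.compile(r"(\w+)\s*\{(.*?)\}", re.DOTALL)
FIELD = re.compile(r'(\w+)\s*=\s*("(?:[^"\\]|\\.)*"|[^;"]+?)\s*;')

def strip_comments(text):
    """Remove // comments, leaving any "//" inside quoted strings alone."""
    return STRING_OR_COMMENT.sub(
        lambda m: m.group(0) if m.group(0).startswith('"') else "", text)

def parse_metadata(text):
    """Parse flat `metadata { key = value; }` blocks into a list of dicts."""
    entries = []
    for keyword, body in BLOCK.findall(strip_comments(text)):
        if keyword.lower() != "metadata":
            continue  # skip block types this sketch does not know
        entry = {}
        for key, value in FIELD.findall(body):
            if value.startswith('"'):
                value = value[1:-1]  # drop the surrounding quotes
            entry[key.lower()] = value
        entries.append(entry)
    return entries

SAMPLE = '''
metadata
{
    name = "PLAYPAL";
    size = 10752;
    crc32 = a18cc8da;
    created_date = "1994-03-30 11:34:04"; // Example dates
    author = "id Software";
    source = "doom2.wad";
}
'''
entries = parse_metadata(SAMPLE)
print(entries[0]["name"], entries[0]["crc32"])
```

Unknown keys simply end up in the dict, which matches the idea that unrecognised values are ignored but retained.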

I don't think it's particularly important to worry about how mismatched metadata will be handled when we don't even have a real specification for how the metadata will be recorded yet.

I also don't think I've seen any of the Doom Builder devs in this thread, and without their input/support there isn't much point in a 'standard' METADATA specification either way.

Gez · May 21, 2015

I'd suggest just "created" and "modified" instead of "created_date" and "modified_date". Since it would also be used by plain old WADs, going for shorter keywords is better.

Also I don't think "size" is needed or really useful.

Linguica · May 21, 2015

Gez said:
Also I don't think "size" is needed or really useful.

"Size" is the only information about a lump (besides the name) that the WAD format already stores.

Gez · May 21, 2015

Yep. I figure the only possible reason to put size here is in the highly unlikely case a file would have a checksum collision with its previous version but a different size. And if we start being paranoid about that, we might instead add another kind of hashing, like md5.
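
For reference, all three candidate values are cheap to compute. A minimal Python sketch using only the standard library (the function name and dict keys are made up for illustration):

```python
import hashlib
import zlib

def lump_fingerprint(data: bytes) -> dict:
    """Return the size, CRC-32 and MD5 of a lump's raw bytes."""
    return {
        "size": len(data),
        # Mask to an unsigned 32-bit value and print as 8 hex digits
        "crc32": f"{zlib.crc32(data) & 0xFFFFFFFF:08x}",
        "md5": hashlib.md5(data).hexdigest(),
    }

fp = lump_fingerprint(b"123456789")
print(fp)
```

The size is 4 bytes of information that the WAD directory already holds, while the MD5 adds 128 bits, so a second hash is the stronger guard against collisions.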

kb1 · May 21, 2015

Graf Zahl said:
After reading through all of this, my only remark can be:
This is a perfect lesson of how a subject can be discussed to death...

True, maybe, but I must say that the discussion was good, for me, anyway - it helped focus me onto the core necessity.

Graf Zahl said:
...And I haven't seen any substantial input from editor developers here so it's quite clear that this will lead to nowhere...

$20? Oh wow, I thought SLADE was Gez's baby, but it's really sirjuddington's project, right?

DaniJ said:
Not to mention - how does this regulate the use of METADATA in a .zip file? Are we seriously considering a specialist tool for editing those?

Edit: Damn it! I said I wouldn't comment on this anymore... last time, promise. Ignore me, I'm out.

Yeah, it's being considered, for 6 pages of threads now.

I thought the .zip file implementations replace the functionality of WADs. Can't a zip file contain a lump, called "~META~"?

sirjuddington said:
...I agree with most of what kb1 has been suggesting in his last few posts, though, and for the most part that's how I would go about it myself if it was only for SLADE.

Thanks for the support. Yeah, we all went around and around for a while. It took me this long to settle on a very small spec, which feels right.

sirjuddington said:
...From what I can tell all that is really needed in the case of a wad file is a text lump named METADATA that contains extra info about each lump in the wad.

Personally I'd go with a UDMF-like syntax as both editors that could implement support already have parsers for UDMF.

So taking, say, PLAYPAL in doom2.wad, we'd have:
metadata
{
    name = "PLAYPAL";
    size = 10752;
    crc32 = a18cc8da;
    created_date = "1994-03-30 11:34:04"; // Example dates, not necessarily accurate :P
    modified_date = "1994-05-04 16:55:42";
    author = "id Software";
    source = "doom2.wad";
}
The METADATA lump would contain one of these blocks for each lump in the wad, and each metadata block can contain any arbitrary values the editor requires, in addition to the standard values above. Values unrecognised by the editor can just be ignored as long as they are retained. The only mandatory values would be 'name', 'size' and 'crc32'.

I don't think it's particularly important to worry about how mismatched metadata will be handled when we don't even have a real specification for how the metadata will be recorded yet.

On UDMF: Other than brackets and double-quotes, your format looks just like the .ini-type format. Works for me.

On mismatched data: If you're careful, you'll get support for it "for free", which, no, is not required, but here's what I think: You'll appreciate it if it's there, and cuss it if it's not. :)

sirjuddington said:
I also don't think I've seen any of the Doom Builder devs in this thread, and without their input/support there isn't much point in a 'standard' METADATA specification either way.

No, not yet. But, again, if the "mismatch"/missing metadata/unsupporting editor stuff is done correctly, it may not matter at first, if the final touch is to open the WAD in, say, SLADE, before publishing it. My guess is that most people will, to put maps in order, to finalize this and that.

It's a chicken and egg problem. I would much prefer support in the lump editor before having support in the map editor, cause the lump editor will naturally concern itself with all lumps.

Gez said:
I'd suggest just "created" and "modified" instead of "created_date" and "modified_date". Since it would also be used by plain old WADs, going for shorter keywords is better.

Also I don't think "size" is needed or really useful.

Agree on both points.

Linguica said:
"Size" is the only information about a lump (besides the name) that the WAD format already stores.

True. At one time, I wanted to add "Size" to disambiguate hash collisions, but the chance is so remote, it's probably not worth adding the extra ~15 bytes/lump to store it. We should keep it as simple as possible.

As far as coding support goes, sirjuddington has been commenting on it, so I assume he's considering giving it a shot...? And, again, I can type up some pseudocode, but the guys building editors don't really need my assistance :) As far as what the exact format will be, I think it's up to the first person that implements it. That will be the spec.

On second thought, here's some pseudocode, just so it's clear:

[Make metadata struct]:
type meta
{
  bool  used;

  str   name;    // lump name; FindUnusedMetaDataEntry() matches on it
  int32 crc32;
  date  datecreated;
  date  datemodified;
  str   author;
  str   source;
  int32 fieldcount;
  str   fields[];
  str   values[];
}

[Add to lumpinfo struct]:
type lumpinfo
{
  ...
  meta  metadata;
}

[Add globally]:
meta  loaded_metadata[];

[Add code]
void LoadMetaDataEntriesFromAllMetaDataLumps()
{
  // clear array, then load and parse all METADATA entries, in order
  loaded_metadata[] = NULL;  
  for each lump named "METADATA"
  {
     parse data, and append it to loaded_metadata[];
  }

  set all loaded_metadata[].used to False;

}

int32 FindUnusedMetaDataEntry(lumpname, crc32)
{
  // search, in order for a matching entry that has not been used
  for each loaded_metadata entry x
  {
    If (loaded_metadata[x].used == False    &&
        loaded_metadata[x].name == lumpname &&
        loaded_metadata[x].crc32 == crc32)
    {
       return x;
    }
  }
  return -1;
}

void GenerateDefaultMetaData(int32 lump_index)
{
  lump[lump_index].metadata.name = lump[lump_index].name;
  lump[lump_index].metadata.crc32 = CalcCRC(lump[lump_index].data);
  lump[lump_index].metadata.datecreated = CurrentDateTime();
  lump[lump_index].metadata.datemodified = lump[lump_index].metadata.datecreated;
  lump[lump_index].metadata.author = NULL;
  lump[lump_index].metadata.source = NULL;
  lump[lump_index].metadata.fieldcount = 0;
}

[At WAD load]:
{
  load lump directory

  LoadMetaDataEntriesFromAllMetaDataLumps();

  for each lump x
  {
    // generate default metadata
    GenerateDefaultMetaData(x);

    // find the matching metadata entry
    int32 loaded_metadata_entry = FindUnusedMetaDataEntry(lump[x].name, lump[x].metadata.crc32);

    If (loaded_metadata_entry != -1)
    {
       // found a name and crc32 match - this metadata is valid
       lump[x].metadata = loaded_metadata[loaded_metadata_entry];

       // mark the entry used, so it will not be applied to another lump
       loaded_metadata[loaded_metadata_entry].used = True;
    }
  }
}

[At WAD save]:
{
   Create new METADATA lump using lump[n].metadata
   Delete old METADATA lumps, if any
   Write new METADATA along with rest of WAD.
}
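
For anyone who wants to try the load-time matching outside an editor, here is a rough Python rendering of the pseudocode above. It is a sketch under assumptions: lumps are plain `(name, data)` tuples, the `Meta` class and function names are invented here, and timestamps are taken in UTC, which the thread has not settled.

```python
import zlib
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Meta:
    name: str
    crc32: int
    created: str = ""
    modified: str = ""
    author: str = ""
    source: str = ""
    extra: dict = field(default_factory=dict)  # custom fields
    used: bool = False

def default_meta(name, data):
    """Fresh metadata for a lump that has no valid entry."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return Meta(name, zlib.crc32(data) & 0xFFFFFFFF, created=now, modified=now)

def find_unused(loaded, name, crc32):
    """First not-yet-used entry whose name and CRC both match, else None."""
    for entry in loaded:
        if not entry.used and entry.name == name and entry.crc32 == crc32:
            return entry
    return None

def attach_metadata(lumps, loaded):
    """Return one Meta per lump, in directory order."""
    result = []
    for name, data in lumps:
        meta = default_meta(name, data)
        match = find_unused(loaded, name, meta.crc32)
        if match is not None:
            match.used = True  # never apply one entry to two lumps
            meta = match
        result.append(meta)
    return result

crc_a = zlib.crc32(b"a") & 0xFFFFFFFF
loaded = [Meta("STEEL1", crc_a, author="first"),
          Meta("STEEL1", crc_a, author="second")]
lumps = [("STEEL1", b"a"), ("STEEL1", b"a"), ("NEW", b"b")]
out = attach_metadata(lumps, loaded)
```

Because entries are consumed in order, two identically named, identical lumps pick up their metadata in lump order, and the new lump falls back to defaults.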

DaniJ · May 21, 2015

(In reply to a question you directed to me personally)

kb1 said:
I thought the .zip file implementations replace the functionality of WADs. Can't a zip file contain a lump, called "~META~"?

Well yes. However, it rather depends on what information you plan to place in the "~META~" file and the representation used, as to whether it is viable (or not) as a practical solution to the problem.

For example, will the user be able to simply open a .zip file using their preferred ZIP tool, drop in a new texture they made in their preferred paint package and then edit the "~META~" definitions accordingly, without having to use a specialist tool to rationalize the edit and update this data file on their behalf? If one can't do that - this seems like a significant inconvenience.

sirjuddington · May 21, 2015

Gez said:
I'd suggest just "created" and "modified" instead of "created_date" and "modified_date". Since it would also be used by plain old WADs, going for shorter keywords is better.

Also I don't think "size" is needed or really useful.

Good points. I'm going to attempt to write up a simple spec for this today, I'll keep that in mind.

As for how it will work with zips, the simplest way IMO would be to have it work exactly the same way it does in wads - a single METADATA entry in the root. Entries in directories would just have the path along with the name, eg. name = "decorate/monsters/imp.txt";
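
To illustrate the path-as-name idea, here is a small Python sketch that walks a zip and emits one block per file. The `md5` and `author` keys, the quoting of the hash, and the function name are assumptions for illustration; escaping of quotes in file names is left out.

```python
import hashlib
import io
import zipfile

def metadata_for_zip(zf: zipfile.ZipFile, author: str = "") -> str:
    """Build METADATA text with one block per file, named by its full path."""
    blocks = []
    for info in zf.infolist():
        if info.is_dir() or info.filename == "METADATA":
            continue  # skip directories and the metadata entry itself
        digest = hashlib.md5(zf.read(info)).hexdigest()
        lines = [f'    name = "{info.filename}";', f'    md5 = "{digest}";']
        if author:
            lines.append(f'    author = "{author}";')
        blocks.append("metadata\n{\n" + "\n".join(lines) + "\n}\n")
    return "\n".join(blocks)

# Demo on an in-memory archive
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("decorate/monsters/imp.txt", "actor Imp {}")
with zipfile.ZipFile(buf) as zf:
    text = metadata_for_zip(zf, author="John Doe")
print(text)
```

The same function could be pointed at a real PK3 by opening it with `zipfile.ZipFile(path)`.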

Alternatively, I'd be fine with storing the metadata along with the zip entries somehow too, though I'm not sure how well wx handles that, would need to check.

Gez · May 21, 2015

Long term we'll still need to write our own zip archive implementation instead of using the wx class because I don't think wx is ever going to support zip compressions other than "store" and "deflate".

Linguica · May 21, 2015

Just out of curiosity, does the UDMF syntax (or whatever it is) allow for nesting? Like, could you do:

some_item
{
	foo = "bar";
	another_item
	{
		blah = "whatever";
	}
}

DaniJ · May 21, 2015

sirjuddington said:
As for how it will work with zips, the simplest way IMO would be to have it work exactly the same way it does in wads - a single METADATA entry in the root. Entries in directories would just have the path along with the name, eg. name = "decorate/monsters/imp.txt";

So for each new texture I drop in my .zip I would simply open the ~META~ file in my preferred text editor to add the new entry in METADATA? A textual representation sounds much easier. So I would determine the base relative path to the new texture I just added and duplicate the path here, also? Sounds good, although I think I'll end up writing a script to do the donkey work, to avoid accidentally getting the path wrong.

andrewj · May 22, 2015

32 bits is too short for the hash (and the Adler32 algorithm is known to give poor results for small data sets) -- use at least 128 bits like MD5.

I agree 'size' is not needed or useful -- the hash should be the only thing to validate that the metadata is valid.

Consider removing the 'name' field too, and just find the metadata based on the hash, like how GIT works (if you don't know how GIT works, go read about it, very interesting).

I think maps need to be handled specially, treated as a single unit and performing the hash over the main level lumps (THINGS, VERTEXES, LINEDEFS, SECTORS, SIDEDEFS) but not the nodes, reject or blockmap data. Whether it should include BEHAVIOR for hexen maps is debatable.
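
A per-map hash along those lines might be sketched like this in Python. The function name and the way lumps are passed in are assumptions; the lump list is the one given above, with BEHAVIOR left out since that point is still open. It does not address node builders that add vertices to VERTEXES.

```python
import hashlib

# Hand-authored map lumps, hashed in a fixed order
SOURCE_LUMPS = ("THINGS", "VERTEXES", "LINEDEFS", "SECTORS", "SIDEDEFS")

def map_md5(map_lumps: dict) -> str:
    """Hash one map from a dict of lump name -> bytes.

    Generated lumps (nodes, reject, blockmap) are ignored entirely.
    """
    digest = hashlib.md5()
    for name in SOURCE_LUMPS:
        data = map_lumps.get(name, b"")
        # Mix in name and length so lump boundaries cannot blur together
        digest.update(name.encode("ascii"))
        digest.update(len(data).to_bytes(4, "little"))
        digest.update(data)
    return digest.hexdigest()

base = {"THINGS": b"t", "VERTEXES": b"v", "LINEDEFS": b"l",
        "SECTORS": b"s", "SIDEDEFS": b"d", "NODES": b"n1"}
rebuilt = dict(base, NODES=b"n2", BLOCKMAP=b"bm")
edited = dict(base, THINGS=b"t2")
print(map_md5(base) == map_md5(rebuilt), map_md5(base) == map_md5(edited))
```

Rebuilding nodes leaves the hash unchanged in this sketch, while moving a thing changes it.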

sirjuddington · May 22, 2015

Linguica said:
Just out of curiosity, does the UDMF syntax (or whatever it is) allow for nesting? Like, could you do:
some_item
{
	foo = "bar";
	another_item
	{
		blah = "whatever";
	}
}

I can't see it mentioned anywhere in the UDMF spec, so probably not. However, the SLADE parser supports nested blocks like this, though I have no idea about DB.

andrewj said:
32 bits is too short for the hash (and the Adler32 algorithm is known to give poor results for small data sets) -- use at least 128 bits like MD5.

Yeah this is probably better, I just had crc32 in the example as SLADE already calculates this. Shouldn't be hard to do MD5.

andrewj said:
Consider removing the 'name' field too, and just find the metadata based on the hash, like how GIT works (if you don't know how GIT works, go read about it, very interesting).

I'd rather leave name in, for matching with empty entries if needed.

EDIT: I've written up a basic draft specification here: http://pastie.org/private/l99ekaw7parhz3jch8eb0w Suggestions? There are probably a bunch more optional fields that could be added as standard. I'm also unsure whether to have the dates in UTC time or just whatever the local system time is when they are updated.

Gez · May 22, 2015

Linguica said:
Just out of curiosity, does the UDMF syntax (or whatever it is) allow for nesting? Like, could you do:

UDMF itself, no. Nesting blocks was rejected specifically to make it simpler to implement a parser.

The parsers in SLADE3, DB2, and ZDoom, however, could all support them. Look at the configuration files for SLADE or Doom Builder and you'll see nested blocks, and DECORATE files are another example of using nested block since states are nested in an actor block.

andrewj · May 22, 2015

As the spec stands, this METADATA lump is like an additional directory to the one in the WAD or ZIP file, supplying optional metadata to each lump (or file) based on its name, and the MD5 hash is a secondary thing used to check that the lump (or file) has not been modified. If we called the METADATA lump a database, the 'name' field would be the primary key.

However, I think it can work better if the MD5 hash was the primary key to the database. There are issues to this approach too, but the main benefit is that lumps or files can be renamed or moved around and you will still find the metadata which belongs to it.
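
As a minimal illustration of hash-as-primary-key, the Python sketch below indexes entries by MD5 and looks a lump up by its contents alone. The dict keys and function names are invented for the example. One of the issues shows up immediately: identical lumps (including empty ones) share a hash, so each key has to hold a list.

```python
import hashlib

def index_by_md5(entries):
    """Map md5 -> list of entries, since identical lumps share one hash."""
    index = {}
    for entry in entries:
        index.setdefault(entry["md5"], []).append(entry)
    return index

def lookup(index, data: bytes):
    """Find metadata by lump contents only; the lump's name is irrelevant."""
    return index.get(hashlib.md5(data).hexdigest(), [])

sprite = b"imp sprite data"
entries = [{"name": "TROOA1", "md5": hashlib.md5(sprite).hexdigest(),
            "author": "John Doe"}]
index = index_by_md5(entries)

# The lump was renamed or moved to another WAD; its bytes are unchanged
found = lookup(index, sprite)
print(found[0]["author"] if found else "no metadata")
```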

Either way, I think maps absolutely need to be considered as a single "object", possibly using a separate block keyword for them, and their MD5 hash should be over a subset of the map data (excluding generated data like the nodes or blockmap). A big issue there is that building nodes can modify the VERTEXES lump to add vertices for the segs -- so ideally the hash would exclude unused vertices. But that kind of processing would be quite a burden to a lump editor.

Also needed is special handling for textures in WAD files, since their data are split across the TEXTUREx, PNAMES lumps and one or more patch lumps. For the metadata to be useful for textures, it should "compose" each texture into a single unit and compute the MD5 hash of it. Again it may need its own block keyword to differentiate from normal lumps/files.

sirjuddington · May 22, 2015

andrewj said:
As the spec stands, this METADATA lump is like an additional directory to the one in the WAD or ZIP file, supplying optional metadata to each lump (or file) based on its name, and the MD5 hash is a secondary thing used to check that the lump (or file) has not been modified. If we called the METADATA lump a database, the 'name' field would be the primary key.

However, I think it can work better if the MD5 hash was the primary key to the database. There are issues to this approach too, but the main benefit is that lumps or files can be renamed or moved around and you will still find the metadata which belongs to it.

Either way, I think maps absolutely need to be considered as a single "object", possibly using a separate block keyword for them, and their MD5 hash should be over a subset of the map data (excluding generated data like the nodes or blockmap). A big issue there is that building nodes can modify the VERTEXES lump to add vertices for the segs -- so ideally the hash would exclude unused vertices. But that kind of processing would be quite a burden to a lump editor.

Also needed is special handling for textures in WAD files, since their data are split across the TEXTUREx, PNAMES lumps and one or more patch lumps. For the metadata to be useful for textures, it should "compose" each texture into a single unit and compute the MD5 hash of it. Again it may need its own block keyword to differentiate from normal lumps/files.

As far as how I plan to implement it in SLADE, the md5 is going to be the 'primary key' of sorts - I wouldn't trust metadata to be up to date if a name matches but md5 does not.

I did think of having maps as a separate block, but I think the map_md5 field for map header entries should be good enough, though really either way would work.

DaniJ · May 22, 2015

So a text version of kb1's model with all the same problems. Such a thing would basically force users to use SLADE to accomplish tasks with .zip files.

Gez · May 22, 2015

DaniJ said:
Such a thing would basically force users to use SLADE to accomplish tasks with .zip files.

You need a tool to generate tool-generated stuff, yes.

Zips have timestamps that can be used in lieu of the metadata, but for the rest, no. What's the alternative? Forking 7zip? Telling users they can conveniently type all that by hand for the 1783 files they have just imported? What?

If you really want to do it all by hand in notepad, again, HashMyFiles is an excellent program.

Anyway, users aren't forced to use SLADE for doing stuff with zip files. After all, they can very well not make use of this fancy schmancy metadata stuff. It's not like Doom modding didn't spend two decades without it...

esselfortium · May 22, 2015

This was always primarily a Slade feature request. Given the intended functionality of it I really can't imagine why anyone would want or need to write it by hand, any more than you would want to build a UDMF map using Notepad. It defeats the point almost entirely.

But you can, if you really want to.

(You don't want to.)

DaniJ · May 22, 2015

This whole line of thinking is based on fundamentally flawed logic. Please demonstrate how this kind of information is necessarily tool-generated stuff. If you design your syntax in a logical manner - editing this "by hand" in a plain old text editor is actually far more convenient for the user.

The premise here is that people would be forced to use SLADE if they wanted to produce a mod that used a "common metadata standard".

Edit: And just so we are clear - I will not consider this kind of solution as any kind of "common standard"

esselfortium · May 22, 2015

The primary goals (as first laid out in the OP) are still to have metadata that automatically follows individual lumps when they are moved from wad to wad and to automatically keep track of update timestamps. Neither of those goals can be accomplished without using a "Specialist Tool". This is and has always been a feature request pertaining specifically to these tools. If you want a manually-written credits file that applies to an entire wad or directory, you can do that the same way you've done before, but it's totally unrelated to what's being developed here.

Gez · May 22, 2015

DaniJ said:
This whole line of thinking is based on fundamentally flawed logic. Please demonstrate how this kind of information is necessarily tool-generated stuff. If you design your syntax in a logical manner - editing this "by hand" in a plain old text editor is actually far more convenient for the user.

How so? If you want to tag 500 texture patches as having been created by John Doe, it's generally more convenient to select them all and right click->edit metadata->type "John Doe" in the "author" field of the form, than to type out the name of each file and copy/paste author = "John Doe"; for each. But if you want to do that all by hand, it's still perfectly possible. Getting the hash isn't an obstacle, not when free and portable tools exist that can generate hashes for thousands of files in a split second.

I've used HashMyFiles and HxD to insert chunks in a PNG file "by hand", it's perfectly possible even if it's generally a better idea to use a graphics editor that supports the PNG standard.

If you're a POSIX wizard you'll probably be even faster with the command line and a handily crafted sed line with a lot of pipes, but in that situation piping in another program to get hashes isn't going to be scary.

DaniJ said:
The premise here is that people would be forced to use SLADE if they wanted to produce a mod that used a "common metadata standard".

Yes, people need to use a standard-supporting tool to make use of a standard.

I don't get what your beef is. It makes no logical sense to me.

DaniJ · May 22, 2015

essel, there is no good reason why the two must be considered as logically different things. As I have argued from the beginning - this is a short-sighted and WAD-centric view of the world. As soon as you bring a container like ZIP into the picture the whole model being proposed here collapses in a practical sense.

There is every reason to expect a single solution which can be applied to both container-level metadata and file-level metadata. The key to solving this in a practical manner is to break the 1:1 relationship between metadata and individual files.

Linguica · May 22, 2015

A few things:

* I agree that if there is a metadata entry with a name and a mismatching MD5, the metadata should be discarded / regenerated. However, if there is an entry with a name and no MD5 entry at all, it should default to being accepted as genuine and an MD5 hash added when read with a conforming tool. I suggest this for two reasons: 1) the default mindset should be to accept data unless it is clearly wrong, and 2) this would allow people to edit / add data in a nonconforming tool in a pinch, and have it be retained and "legitimized" by SLADE later.

* I think automatically treating maps as a single "object" is more trouble than it is worth, both on a technical level as well as a more basic conceptual level. As I've said before, the lumps associated with a map should not be assumed to be standardized for now and for all time. As it is, there's already the schism between Doom's list of lumps, and the added map lumps that Hexen has, not to mention ZDoom's BEHAVIOR and whatever else it might have per-map. Having a "built-in" list of lumps that an editor combines to form metadata privileges the games / source ports that it directly supports and neglects the fact that either now or in the future there might be different per-map lumps that someone is using. Basically I think automatically characterizing lumps by their content / context is generally a bad idea.

* HOWEVER, this doesn't mean we can't add extra structure to metadata for a lump if we want to. With regard to the TEXTUREX lumps, we need to define extra data somehow if we want to be able to have metadata for individual textures. This is why I was asking if the format will allow for nesting. I think the simplest and most flexible way is to allow metadata entries to be nested, and then allow lumps to be typed in a few different ways: as a "group marker", which will be followed by a list of real lumps associated with that first lump; as a "virtual lump list", which will be followed by a list of virtual lumps stored within the lump, and which will require special handling; and as "virtual lumps", which aren't real lumps but which the tool has written special code to handle as lumps for convenience. Like so:

TEXTURE1
{
	lump_type = "virtual_lump_list";
	description = "The first internal list of textures";
	md5 = "2BF029D3AFFE3755F2063A5CD41C1AE5";
	AASHITTY
	{
		lump_type = "virtual";
		longname = "Placeholder - do not use";
		description = "This is a fake texture used to take up the zero slot in the texture array";
		md5 = "3F2F4295A5EB6AD967B832D35E048852";
	}
	ASHWALL2
	{
		lump_type = "virtual";
		longname = "A wall made of ash, I dunno, whatever";
		description = "you get the idea";
		md5 = "1223B8C30A347321299611F873B449AD";
	}
}

And then you could use nesting lumps to provide structure for maps if you so desired:

MAP01
{
	lump_type = "group";
	description = "MAP01 - zero-length marker lump";
	THINGS
	{
		description =	"Things for MAP01 - this is just a normal lump that is next
				in the linear list after MAP01, but is structured in the
				metadata as 'belonging' to it";
		md5 = "99754106633F94D350DB34D548D6091A";
	}
	LINEDEFS
	{
		md5 = "964D72E72D053D501F2949969849B96C";
	}
}
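The accept / discard rule from the first point above could be sketched roughly as follows in Python. This is only an illustration: the `reconcile` name, the dict representation of an entry, and the choice to return None for a discarded entry are all assumptions of this example, not part of any agreed design.

```python
import hashlib
from typing import Optional


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as an uppercase hex string."""
    return hashlib.md5(data).hexdigest().upper()


def reconcile(entry: dict, lump_data: bytes) -> Optional[dict]:
    """Apply the proposed rule to one metadata entry (hypothetical helper).

    - no md5 field: accept the entry as genuine and add the hash
    - md5 matches:  keep the entry unchanged
    - md5 differs:  discard the entry (the caller may regenerate it)
    """
    actual = md5_hex(lump_data)
    stored = entry.get("md5")
    if stored is None:
        # Hand-written entry: "legitimize" it by filling in the hash.
        return {**entry, "md5": actual}
    if stored.upper() != actual:
        # The lump changed, or the entry belongs to a different lump.
        return None
    return entry


if __name__ == "__main__":
    hand_written = {"description": "added in a text editor"}
    print(reconcile(hand_written, b"lump bytes"))
    stale = {"description": "old", "md5": "00000000000000000000000000000000"}
    print(reconcile(stale, b"lump bytes"))
```

Comparing hashes case-insensitively is a deliberate leniency here, so that an entry typed by hand with lowercase hex is not thrown away.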

Altazimuth · May 22, 2015

DaniJ, from what I can see, your complaints are undermined by the fact that this feature appears to be intended to be automated rather than manual. The whole idea, at least for me, is that it REMOVES the fallibility of humans, who would not perfectly update credits lists.

Gez · May 22, 2015

Linguica said:
As it is, there's already the schism between Doom's list of lumps, and the added map lumps that Hexen has, not to mention ZDoom's BEHAVIOR and whatever else it might have per-map.

"ZDoom's BEHAVIOR" is the single added map lump that Hexen has. :P

If you want a source of extra map lumps, look up glBSP GL nodes, and also Strife Veteran Edition's lightmap lumps. :)
