Linguica Posted September 8, 2015 Since 1989, PC users have known and loved the ZIP file format. Doom WADs benefited significantly from the compression, and it quickly became the standard way to distribute WADs (and LMPs) on the internet. Even today, with PK3s and PK7s and what have you, the ****.zip filename is still standard operating procedure for distribution. Unfortunately, this is going to become a problem. Google is moving to become the purveyor of .zip as a new top-level domain, like .com and .net and .world and whatever else. I'd heard about this for a while, but didn't really think much of it. Then today I saw this: a post where doom.zip had been turned into a link pointing to a nonexistent http://doom.zip/ . And with that one hyperlink, I realized that the war, if there even had been one, was lost. For 25+ years we've been accustomed to being able to say "blah.zip" and trust that the reader knows we're talking about a filename. No longer. As Twitter has already shown, a bare filename ending in .zip is now increasingly likely to be automatically interpreted by whatever website you post it on as a URL pointing to an unrelated domain name. And this is sort of a problem for the Doom community, since we've accumulated tens of thousands of ZIP files, and those ZIP files are stored in a public-facing FTP archive, and it's very easy to refer to a file as "d2twid.zip" or what have you. And now, in the future, if you happen to post on a website a link to http://www.gamers.org/pub/idgames/levels/doom/v-z/wow.zip you aren't going to know for sure that it won't come out the other end like http://www.gamers.org/pub/idgames/levels/doom/v-z/wow.zip (with the trailing wow.zip linked as a domain name of its own). I just finished rewriting the /idgames database to no longer use the .zip portion of filenames in canonical URLs to files, since they can no longer be implicitly trusted.
(Incidentally, I had noticed that none of the /idgames database URLs that ended in .zip were being crawled by Google, even though they were perfectly valid; Google absolutely refused to crawl them, even when given a sitemap. Coincidence?) I'm not sure what the endgame is here. In one fell swoop, Google has basically just assured websites that deal with ZIP files no end of grief; whereas up until now we could safely assume a link ending in .zip was a link to a real file, now we have to check whether it's a valid domain name; and worse, now it's possible that a malformed URL to a ZIP file will instead lead to a .zip domain name, perhaps one specifically set up to phish or otherwise do bad things to you. Yay technology.
Chris Hansen Posted September 8, 2015 That was very informative and interesting; thanks for sharing your thoughts on it, Ling. It's funny how the Internet and its associated big businesses operate. In some regards it still feels like the Internet is in its infancy, or like we're in the Old West. It's like there are no rules as to what is and isn't allowed.
Quasar Posted September 8, 2015 Sooo how about when there are .txt, .wav, .jpg, .png, .rar, .7z, .mov, etc. TLDs? How about .wad? Do we just quit naming files with extensions entirely? >_>
Jon Posted September 8, 2015 Interesting stuff, Linguica, but I'm not convinced that the TLD will spell the end of .zip. The heuristics that sites like Twitter use to find and mark up URLs are exactly that: heuristics, and they make mistakes. The question is, how often is foo.zip actually going to be a URL, and how often is it not? That will depend on how popular the TLD becomes, and that's hard to predict. But Doom isn't the only thing using ZIPs; they're still utterly prevalent all over the place. (Personally I prefer them to .tar.gz on UNIX systems, because they have a ToC, the central directory at the end of the file, and you don't need to read the whole file to figure out what's in it.) I would not be surprised if Twitter tweaked their algorithm and foo.zip fell back to not being a URL. (And damn Google for jumping on the goldrush bandwagon with these nonsense TLDs. I can forgive .xyz for being a) cheap and b) meaningless, which pokes the whole unrestricted TLD nonsense in the eye a little bit.)
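As a rough illustration of the table-of-contents point above: the ZIP format stores its member list in a central directory at the end of the archive, so a reader can enumerate the contents without decompressing any member. The sketch below uses Python's standard `zipfile` module; the helper names `build_demo_zip` and `list_zip_members` and the member names are made up for this example.

```python
import io
import zipfile


def build_demo_zip() -> bytes:
    """Build a small in-memory ZIP archive (member names are illustrative only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("wow.wad", b"PWAD" + bytes(64))
        archive.writestr("wow.txt", "Title: demo\n")
    return buffer.getvalue()


def list_zip_members(data: bytes) -> list:
    """Return member names by reading the archive's central directory.

    No member data is decompressed; ZipFile only parses the directory
    records when the archive is opened.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    print(list_zip_members(build_demo_zip()))
```

A .tar.gz, by contrast, has no such index, so listing it generally means decompressing the stream from the start.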
VGA Posted September 8, 2015 Why are the retards on Google registering a TLD with the same name as one of the most popular file extensions? Maybe the most popular, actually, after txt I guess.
chungy Posted September 8, 2015 What's next? A ".com" TLD? How are we supposed to know the difference between an executable and a domain!?
Quasar Posted September 8, 2015 chungy said: "What's next? A ".com" TLD? How are we supposed to know the difference between an executable and a domain!?" You laugh, but that's been an actual issue in Windows since IE 4.0 was integrated with the shell ;)
Jaxxoon R Posted September 8, 2015 The only answer is to replace the .zip file extension with .google so that nobody is happy.
ArmouredBlood Posted September 8, 2015 I guess I can't tell people to look up newgothic.zip anymore. Not that I normally add the .zip to it. From now on I guess people are going to make databases with .7z files instead.
Graf Zahl Posted September 8, 2015 Quasar said: "You laugh, but that's been an actual issue in Windows since IE 4.0 was integrated with the shell ;)" Has it? I couldn't resist testing this, typing 'command.com' into the address bar of Windows Explorer, and the results were what sane people would expect:
- A file with the given name exists in the path: the file gets launched.
- No such file exists: Explorer tries to open a website with the given name.
- Internet Explorer always opens the website.
So obviously the local variant is given precedence, but only in Windows Explorer. Doesn't sound like a problem to me.
Maes Posted September 8, 2015 That's why just looking at a "%s.%s"-type string should not lead to assumptions about what it is. It could be:
- A filename
- A URL (but without a protocol prefix such as ftp://, http:// etc., it should NOT be assumed it's a URL, even though in the context of a tweet it may make sense)
- An object/struct field-access statement in many programming languages
- etc.
It's a typical context-dependent interpretation problem, and there's no single "correct" approach (other than leaving it alone and not trying to guesstimate. Screw "user friendliness". If a luser thinks it's a URL, well, have him copy & paste it himself). VGA said: "Why are the retards on Google registering a TLD with the same name as one of the most popular file extensions? Maybe the most popular, actually, after txt I guess." Well, after all, there are only 17,576 TLAs. The most obvious "victim" is the .com extension, which, however, is little more than a legacy in modern OSes. I don't know if it's even possible to create a .COM executable with modern compilers. Also, what about the Aminet habit of storing files extension-first? E.g. "zip.doom", "bin.data", "mod.music" etc.
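The rule suggested above (treat a string as a URL only when it carries a protocol prefix, and otherwise leave a "%s.%s" string alone) can be sketched roughly as follows. This is a minimal illustration and not how any particular site does its link detection; the function name `classify` and its three labels are invented for the example.

```python
import re

# A scheme such as http:// or ftp:// at the start of the token.
URL_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")
# A bare "%s.%s" token: two non-empty parts around a single dot.
DOTTED_NAME = re.compile(r"^[^\s./]+\.[^\s./]+$")

# Three-letter acronyms over a 26-letter alphabet: 26 * 26 * 26.
TLA_COUNT = 26 ** 3


def classify(token: str) -> str:
    """Label a token as 'url', 'ambiguous' or 'text' (hypothetical labels)."""
    if URL_SCHEME.match(token):
        return "url"        # explicit protocol prefix: safe to link
    if DOTTED_NAME.match(token):
        return "ambiguous"  # filename, domain or field access: do not guess
    return "text"


if __name__ == "__main__":
    for sample in ("http://doom.zip/", "doom.zip", "d2twid.zip", "hello"):
        print(sample, "->", classify(sample))
```

The design choice here is simply to refuse to guess: anything in the "ambiguous" bucket is left as plain text for the reader to interpret.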
LogicDeLuxe Posted September 8, 2015 Linguica said: "http://www.gamers.org/pub/idgames/levels/doom/v-z/wow.zip" I think any link parsing software should be smart enough to notice that org is the top level domain here. And with the http at the beginning, a parser should even recognize unknown top level domain names. In fact, this forum automatically forms a link with anything beginning with or http://www. and is also smart enough to get something like http.pk7.org/www/http/7z/rar/ftp/zip/www.zip right. Linguica said: "Incidentally, I had noticed that all the /idgames database URLs that ended in .zip were not being crawled by Google, even though they were perfectly valid, and Google absolutely refused to do so, even when given a sitemap." Now that is too stupid not to be intentional. Most retarded decision ever. At least there's also a URL ending with .txt accompanying each entry, so they can still be found.
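To illustrate the point that a parser given the full URL can tell which part is the top-level domain, here is a small sketch using Python's standard `urllib.parse`. The helper name `host_tld` is made up for this example, and it only takes the last dot-separated label of the host; it does not consult any registry of valid TLDs.

```python
from typing import Optional
from urllib.parse import urlsplit


def host_tld(url: str) -> Optional[str]:
    """Return the last label of the URL's host, or None if there is no host.

    Only the host part is examined, so a ".zip" at the end of the path
    does not affect the result.
    """
    host = urlsplit(url).hostname
    if not host:
        return None
    host = host.rstrip(".")
    if "." not in host:
        return None
    return host.rsplit(".", 1)[1]


if __name__ == "__main__":
    print(host_tld("http://www.gamers.org/pub/idgames/levels/doom/v-z/wow.zip"))
    print(host_tld("http://doom.zip/"))
    print(host_tld("wow.zip"))
```

Without a scheme, `urlsplit` treats the whole string as a path, which is why the bare `wow.zip` yields no host at all.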
Gez Posted September 8, 2015 Obviously the Doom community needs to take ownership of the .wad and .pk3 TLDs.
LogicDeLuxe Posted September 8, 2015 Maes said: "The most obvious "victim" is the .com extension, which however in modern OSes it's all but a legacy. I don't know if it's even possible to create a .COM executable with modern compilers." It's not only a file extension, but also used for serial ports. If you try to use it as a file name on DOS or NT based systems, you'll be in trouble. It may even hang the computer under certain conditions. If you were really mean to DOS and Windows users, you could create a zip file in any competing OS and name all the files after DOS ports.
Maes Posted September 8, 2015 LogicDeLuxe said: "It's not only a file extension, but also used for serial ports. If you try to use it as a file name on DOS or NT based systems, you'll be in trouble. It may even hang the computer under certain conditions. If you were really mean to DOS and Windows users, you could create a zip file in any competing OS and name all the files after DOS ports." Heh, I forgot about those, but I think they almost always use 4-letter identifiers (COM1, COM2, LPT1, LPT2, NULL, etc.)
Jon Posted September 8, 2015 Gez said: "Obviously the Doom community needs to take ownership of the .wad and .pk3 TLD." Last I checked it was around $150,000 to submit a request for a TLD (which might not be granted). Doom community kickstarter? :)
LogicDeLuxe Posted September 8, 2015 Maes said: "Heh I forgot about those, but I think they almost always use 4-letter identifiers (COM1, COM2, LPT1, LPT2, NULL, etc.)" Well, you're right. You can name a file com in Windows. It refuses to name a file something like com1.txt, though.
Maes Posted September 8, 2015 LogicDeLuxe said: "Well, you're right. You can name a file com in Windows. It refuses to name a file something like com1.txt, though." Heh, there's something I didn't know. FWIW, the name of the null device is "NUL". Other reserved TLAs include AUX, CON and PRN.
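The reserved device names mentioned in the last few posts can be checked with a few lines of string handling. The sketch below assumes the commonly documented set (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) and the rule that the extension is ignored, which matches the com1.txt observation above; exactly which names a given DOS or Windows version rejects is an assumption here, and the function name is made up for the example.

```python
# Commonly documented reserved device names (assumed set, see lead-in).
RESERVED_DEVICE_NAMES = {"CON", "PRN", "AUX", "NUL"} | {
    f"{prefix}{number}" for prefix in ("COM", "LPT") for number in range(1, 10)
}


def is_reserved_device_name(filename: str) -> bool:
    """Return True if the name, ignoring case and extension, is a device name."""
    # Everything from the first dot onward is ignored, so "com1.txt"
    # is treated the same as "COM1".
    stem = filename.split(".", 1)[0].rstrip(" ").upper()
    return stem in RESERVED_DEVICE_NAMES


if __name__ == "__main__":
    for name in ("com", "com1.txt", "NUL", "NULL", "doom.zip"):
        print(name, "->", is_reserved_device_name(name))
```

This is pure string logic, so it runs the same on any OS; it does not ask Windows itself whether the name is acceptable.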
Walter confetti Posted September 8, 2015 Interesting article, Linguica, and this decision to make .zip a domain just sounds really dumb. As other people have said, the .zip extension is probably the most used one for data compression, and lots of people use it. I really hope that Google will change their plans... EDIT: I just tested whether a zip can still be downloaded or whether it gives me an error: on my phone the link works, but if I type file.zip in the URL bar, it sends me to a nonexistent .zip domain... EDIT2: oh boy, there is a .mov domain too? Sure, it's a pretty rare video format, but this is just silly...
Jon Posted September 8, 2015 walter confalonieri said: "EDIT2: oh boy, there is a .mov domain too? Sure, it's a pretty rare video format, but this is just silly..." There'll be a TLD for any popular file extension in short order. What's silly is over-eager link-finding heuristics.
Graf Zahl Posted September 8, 2015 Maes said: "I don't know if it's even possible to create a .COM executable with modern compilers." You can rename any .exe to .com and it will still launch. The .com format is indeed dead, as it's strictly DOS.
Maes Posted September 8, 2015 Graf Zahl said: "You can rename any .exe into .com and it still will launch. The .com format is indeed dead as it's strictly DOS." Out of curiosity, I explored a few ".com" files that still exist in the deep recesses of the Windows root directory (Windows 7, 64-bit SP1, for the record) with a hex viewer: they all have the "MZ" executable header, so I guess that this makes them technically ".exe" files, and that ".com" is simply treated as an alias for ".exe". Besides, I don't think a true .com file with 16-bit x86 code and exclusive 1-segment access could work under a 64-bit OS.
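The hex-viewer check described above amounts to looking at the first two bytes of the file. A minimal sketch, assuming only that an EXE-style file starts with the ASCII bytes "MZ" as stated in the post; the function names are made up for this example, and the check says nothing about whether the rest of the file is a valid executable.

```python
def has_mz_header(data: bytes) -> bool:
    """Return True if the data starts with the two-byte 'MZ' signature."""
    return data[:2] == b"MZ"


def file_has_mz_header(path: str) -> bool:
    """Read only the first two bytes of the file at `path` and test them."""
    with open(path, "rb") as handle:
        return has_mz_header(handle.read(2))


if __name__ == "__main__":
    print(has_mz_header(b"MZ\x90\x00"))      # EXE-style header
    print(has_mz_header(b"\xb4\x09\xba"))    # raw machine code, no signature
```

A true DOS .com file has no header at all (it is raw code loaded into one segment), so a file with a .com name that passes this check is an EXE-format file under another extension.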
LogicDeLuxe Posted September 8, 2015 Graf Zahl said: "You can rename any .exe into .com and it still will launch." Just tried that. It even works on 64-bit. I'm surprised that it even recognizes that extension and also tries to execute it, even though it doesn't support 16-bit software to begin with.
Pencil of Doom Posted September 8, 2015 Damn, this sucks; it's one of the stupidest ideas that Google has had.
esselfortium Posted September 8, 2015 And this comes up after all the other good file extensions are taken as TLDs, too! If we had only had more foresight, we could've staked out a claim to .meme, .blackfriday, and .wang before they were all snatched up by the unrelenting corporate machine.
printz Posted September 8, 2015 Hmm, indeed, I don't see THAT much use of zip files these days. Applications are downloaded as .exe, .msi, .dmg, .deb, .rpm. PDFs and other media are downloaded directly. You can't even download a .zip on an iOS device unless you have an app for it. And haha: if there's a need to package multiple files, laymen tend to use .rar files more often than .zip! Someone from this community should make or buy "doom.zip" as soon as possible! Even more, why not just grab them all?
ptoing Posted September 8, 2015 Or just use .7z, which has a better compression rate than zip as well. \o/
Jon Posted September 8, 2015 ptoing said: "Or just use .7z which has a better compression rate than zip as well. \o/" Or .xz, which has a better rate than 7z, or... perhaps compression rate isn't the most important factor. (Decompression *speed* is often pretty important too. And compatibility.)
boris Posted September 8, 2015 Quasar said: "Sooo how about when there are .txt, .wav, .jpg, .png, .rar, .7z, .mov, etc. TLDs? How about .wad? Do we just quit naming files with extensions entirely? >_>" There is a .mov TLD. It's owned by Google. For all the lunacy, look at https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains According to the article, there are a whopping 1034 of them.