Chocolate Strife: A_FireSigil


kb1 said:

Ok, I have to call 'bullshit' on that one. No one claims that aggressive optimizations are boneheaded. Just about everyone would claim that erroneous optimizations are, in fact 'boneheaded'.

Undefined or not, if the optimization produces different output on vs. off, that's boneheaded.

And, no, I'm not naive, and, yes, I realize that the above statement is impractical to implement. But, you know why it is impractical? It's because of that word 'undefined'. Leaving something undefined, especially something that's next to trivial to fix, leaves holes in the implementation that create the possibility for things to go wrong. Furthermore, an undefined behavior benefits no one - why defend its right to exist?

In Entryway's example, the code as written by the programmer was well defined. Add 1000000000 to an integer. That has the same effect on a 32-bit integer, signed or unsigned. Yet, the compiler changed the code by defining a literal inside a for loop that was out of range for that for loop's variable. That is boneheaded.

Don't have time to reply to the rest, but I wanted to start here.

By your definition pretty much all optimizations are "boneheaded." Offhand I can't think of a single one that won't, in some case, cause different behavior in an optimized build versus an unoptimized one. I've never used a compiler that doesn't sometimes generate bugs which appear only in release builds, for exactly this reason. If everything were defined then indeed we wouldn't have these issues; instead, the compiler would have to prove by static analysis that each of these optimizations can be made, and there would be far fewer optimization opportunities.

As for entryway's example: as I pointed out before, if it's so obvious what should be done, then why did Gez write the wrong value when unrolling the loop? (Since it's UB, Gez's solution is technically still correct, but no compiler would ever make that decision unless int is 64 bits, or it's specifically coded to handle that case, since the types change.)
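To make the pattern concrete, here is a minimal sketch (not the actual code from entryway's post) of the kind of expression being argued about: adding 1000000000 to a signed 32-bit int. If the sum overflows, signed arithmetic is undefined behavior, and an optimizer that assumes "no overflow ever happens" may transform a loop containing it so that -O2 output differs from -O0 output. Routing the addition through unsigned arithmetic keeps the wrap defined:

```cpp
#include <cstdint>

// Wrapping addition of 1000000000, spelled out so no signed overflow
// occurs. Unsigned arithmetic wraps modulo 2^32 by definition; the
// final conversion back to int32_t is implementation-defined before
// C++20 but two's-complement on every mainstream compiler.
std::int32_t add_billion_wrapping(std::int32_t i)
{
    return static_cast<std::int32_t>(
        static_cast<std::uint32_t>(i) + 1000000000u);
}
```

The direct form `i + 1000000000` gives the same machine code on a two's-complement CPU, but only the unsigned version is guaranteed to survive the optimizer unchanged.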

I'm not going to reply to the shift right stuff. It's probably something inherited from C, and there has been no call to change it. That's not to say it wouldn't be changed if someone presented a paper on it, and this is one of the few cases where defining it as arithmetic shift right probably could work. It's the generalization to all UB that others are making that concerns me.

Graf Zahl said:

I have to say that this is my biggest gripe with C: The undefined size of integer types.

Is there any problem with using the "defined size" types from "stdint.h"?

Blzut3 said:

I'm not going to reply to the shift right stuff. It's probably something inherited from C, and there has been no call to change it. That's not to say it wouldn't be changed if someone presented a paper on it, and this is one of the few cases where defining it as arithmetic shift right probably could work. It's the generalization to all UB that others are making that concerns me.



All I say about this is:

An operator that isn't defined across the entire value range of its operands is broken by design.
Exceptions are made for mathematically invalid operations like division by zero, of course, because there is no good way to define them.

In this case they should have prohibited shifting of signed values from the start.

That said, there is so much code out there whose life depends on stuff like signed shifts and integer overflows that any compiler developer would be in for a rough ride if these things worked differently - and pointing to the standard not defining this won't help them if their compiler fails to handle common production code.
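For reference, the behavior being debated: right-shifting a negative signed value was implementation-defined (arithmetic or logical shift, the compiler's choice) until C++20 finally pinned it down as arithmetic. An arithmetic shift can also be written out portably, and compilers fold this back into a single sar/asr instruction:

```cpp
#include <cstdint>

// Portable arithmetic shift right: for negative x this computes
// floor(x / 2^n). ~x is nonnegative when x is negative, so the inner
// shift is fully defined on every conforming implementation.
std::int32_t asr32(std::int32_t x, unsigned n)
{
    return (x < 0) ? ~(~x >> n) : (x >> n);
}
```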

scifista42 said:

Is there any problem with using the "defined size" types from "stdint.h"?



There's some weasel wording of 'at least xxx bits' in there somewhere which could render the entire thing worthless if some compiler developer wanted to cause trouble. But here the same goes as in my previous post: Such developers would have a very hard time justifying such decisions to their customers.


I probably ramped up the rhetoric too high in suggesting that the people behind these things have poor or bad motivations, but that's how it can seem from the perspective of a programmer the Nth time you get the brusque response that you shouldn't be causing undefined behavior, when the result is something no human insight would catch without mentally compiling the code as you write it - which assumes knowledge of both the underlying machine code AND how it might be optimized.

Some cases of undefined behavior are certainly called for, but there are some things I think should *not* be undefined:

  • Shift on any sort of number in either direction (already discussed ad nauseam)
  • Behavior of integer overflows, signed OR unsigned (make it wrap, always, obvious)
  • Size of types should just be frozen as they are currently, period; defer to C99 types for specific sizes and require them to exist - the compiler can emulate them in software if the hardware has issues - no more debacles like the multiple models for "long" under x64. Ridiculous.
  • Modulus on negative numbers. Just pick something already.
  • Comparing a pointer one *before* the beginning of an array. Machines have linear address spaces now. This should not be an issue for anything except something like DOS or a 65816 with segments.
  • Passing NULL to any C library function that takes a pointer as input. Except where this has a defined meaning already (and these should be standardized in many cases where people are being noncompliant, such as fclose(NULL) or snprintf(NULL,...)), errno should be set or an error code returned, as is the custom for the function otherwise. The extra error check is not going to murder performance.
I could go on, those are just a few of the most obvious programmer-antagonistic instances of undefined- or implementation-defined C behaviors that I have run into. When you cannot even depend on basic arithmetic to have defined behavior, you have a language design issue from my POV.
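One entry on that list has in fact been settled: since C99 (and C++11), integer division truncates toward zero and `%` takes the sign of the dividend. What usually bites people is wanting a nonnegative result from a negative dividend, which takes one extra line on top of the now-defined operator:

```cpp
// Floor-style modulus with a result in [0, n). Assumes n > 0.
// The built-in % follows the sign of the dividend (C99 / C++11 rule),
// so -7 % 3 is -1; one conditional add turns that into 2.
int mod_floor(int a, int n)
{
    int r = a % n;
    return (r < 0) ? r + n : r;
}
```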

Quasar said:

Comparing a pointer one *before* the beginning of an array. Machines have linear address spaces now. This should not be an issue for anything except something like DOS or a 65816 with segments.


Now, that actually *IS* implementation dependent - or hardware dependent, to be more precise. How would you do it in a way that's compatible with something like 16-bit DOS? You just can't take a valid address of the element before such an array if its base address is 0.

Graf Zahl said:

There's some weasel wording of 'at least xxx bits' in there somewhere which could render the entire thing worthless

int8_t, uint32_t (etc.) are exact-width types. Are you saying that they really aren't or might not be?
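For what it's worth, the "at least xxx bits" wording applies to the int_leastN_t and int_fastN_t families; the intN_t/uintN_t types are exact-width (and two's complement for the signed ones) whenever they exist - they are only optional on hardware without 8-bit bytes. The property can be pinned down at compile time:

```cpp
#include <cstdint>
#include <climits>

// If int32_t exists at all, it must be exactly 32 bits wide; this
// static_assert would fail to compile on any platform where it isn't.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32,
              "int32_t must be exactly 32 bits");

// Runtime helper (an illustrative name, not a standard function) so
// the width is also observable in running code.
unsigned width_of_uint32()
{
    return sizeof(std::uint32_t) * CHAR_BIT;
}
```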

Quasar said:

Some cases of undefined behavior are certainly called for, but there are some things I think should *not* be undefined:

I'd probably add the whole sequence points thing to that list. The fact that something as simple as:

a = b() + c();
or
a(b(), c());
can be ambiguous is pretty ridiculous nowadays. Actually I can't think of any languages I've used for serious work in the past 5 years apart from C/C++ which don't make this stuff explicit.
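fraggle's examples, made concrete: when the two calls share state, the value of the expression depends on which operand is evaluated first, and C++ leaves that order unspecified (before C++17 the calls could even interleave). Naming temporaries forces the order:

```cpp
// A shared counter makes the evaluation order observable.
static int counter = 0;
static int next() { return ++counter; }

// May compute 1*10 + 2 == 12 or 2*10 + 1 == 21, depending on which
// operand the compiler evaluates first.
int ambiguous()
{
    counter = 0;
    return next() * 10 + next();
}

// Sequencing the calls with named temporaries always yields 12.
int well_defined()
{
    counter = 0;
    int first  = next();
    int second = next();
    return first * 10 + second;
}
```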

kb1 said:

So, we can blame you for some of this? :)

Not until C++17.

kb1 said:

a compiler option

So one of the things you'll quickly learn if you ever work on the C++ spec is that the compiler doesn't exist. C++ is defined only as what makes a semantically valid C++ program. Either C++ checks array accesses or it doesn't. If a compiler developer wants to add an optional extension to check array accesses then that's fine.

kb1 said:

Hey, there's nothing wrong with compatibility. So, add a keyword for all the shiny new "C-squared" compilers that will cause old compilers to choke, but will enable new, 'defined' behavior. And, if it's silly to force a particular 'definition', give that power to the programmer. Instead of >>, provide assembly-like operators ROR, RLC, SLA, etc. And, yeah, it may generate a lot of code on some platforms, but, damnit, the language should cater to the programmer, not the other way around. I should not have to remember that "oh yeah, I'm using a signed number, everything goes to shit...". I expect the computer to work out that level of detail. And, "integer overflow?" I consider that almost a misnomer. Integers don't overflow, they just count, and sometimes carry. It's like saying that my clock overflows every 12 hours.

While having a keyword to enter into a different language is a possible way to solve the compatibility problem, there's about a zero chance of it ever happening. At that point you may as well just start from scratch. Besides, the C++ standard is big enough without defining two languages.

kb1 said:

Man, that sorta sounds a bit snotty to me, I gotta say. Besides, 'points' can't be naive - 'points' don't think. Personifying 'points' is an undefined operation :P

I'm going to file this under intentional misunderstanding of idioms. A naive point is one made by someone who naively assumes that it's actually a good point. So yes I was referring to the person who made them.

kb1 said:

So, not all platforms have numeric coprocessors, so don't provide multiplication, just let the poor programmer write their own implementation. Yeah, that makes sense?

I assume you say this focusing on signed shift right. The performance I'm talking about involves removing the need for branching in all code because there's a chance that the operation performed may be out of range. For example, why does C++ define type alignments? If it didn't, then one would have to assume that pointers could be misaligned. If the architecture doesn't support unaligned read/write, then it would need to insert software handling at every pointer access it can't prove to be aligned. In the case of x86, assuming pointer alignment allows vectorized code to be smaller and contain fewer branches.

The arithmetic stuff either allows C to map closer to machine-level instructions, and/or empowers the optimizers to hoist operations out of loops based on the assumptions granted by the standard. These do allow well-written code to run faster, but I'm personally more interested in the more subtle undefined behavior, which allows optimization opportunities that are actually hard to express equivalently in C++. These usually draw even more fire from developers, since it's harder to explain why the optimization is useful.

Quasar said:

A bunch of C stuff

Some of these probably wouldn't exist if C++ wasn't based off of C. Off hand I don't have arguments for them, but there's probably someone that does. I personally don't think C is a very well designed language, it just happens to be good enough and widely adopted.

fraggle said:

I'd probably add the whole sequence points thing to that list. The fact that something as simple as:

a = b() + c();
or
a(b(), c());
can be ambiguous is pretty ridiculous nowadays. Actually I can't think of any languages I've used for serious work in the past 5 years apart from C/C++ which don't make this stuff explicit.

This has a very, very low chance of changing. If C++ were an upcoming language it would be easy to solve, however right now it would be an ABI break. Even more importantly, an ABI break with C. This means if C++ were to define evaluation order as left to right, and the target platform used to use right to left, then it would need to take the performance hit of flipping the stack on every single function call. Or at least on every function call where it can't prove the operands can safely be reordered.

Blzut3 said:

This has a very, very low chance of changing. If C++ were an upcoming language it would be easy to solve, however right now it would be an ABI break. Even more importantly, an ABI break with C. This means if C++ were to define evaluation order as left to right, and the target platform used to use right to left, then it would need to take the performance hit of flipping the stack on every single function call. Or at least on every function call where it can't prove the operands can safely be reordered.



I see you are perfectly suited for committee work, justifying shitty design with keeping compatibility with shitty design.

Obviously things won't improve if handled like that.


So how did C get away with something as nasty as strict aliasing rules, then? That broke a lot more legitimate stuff without providing any visible advantage outside of micro-optimization.

If C managed to make C99 somewhat incompatible with C89, why can't C++ do the same?
C++11 already did include some changes that required code adjustments - all I say is 'narrowing conversion'.

Why are these undefined holes such sacred cows while the other far minor problems get addressed?

If something is left undefined because architectures may not agree about it, such a feature is better left out completely, i.e. the compiler should throw an error or a warning - as it is worthless without definition.

Graf Zahl said:

I see you are perfectly suited for committee work, justifying shitty design with keeping compatibility with shitty design.

Obviously things won't improve if handled like that.

So if the next version of Visual Studio inserted code to flip the stack every function call you wouldn't complain about it making code that runs slower than 2015? I highly doubt it.

I'm sorry, but ABI compatibility is huge, since it's inevitable that you're going to work with an older library (or how about just the Win32 API? What would users think if every single Windows application stopped working because the C standards people decided to define the evaluation order?) or your middleware is going to be compiled with an older compiler.

Graf Zahl said:

So how did C get away with something as nasty as strict aliasing rules, then? That broke a lot more legitimate stuff without providing any visible advantage outside of micro-optimization.

I fail to see how removing pointer dereferences is only a micro optimization. Pointer dereferences almost always come with a cache miss.

I don't even think the strict aliasing rules came about as a real change. I think someone realized that optimizers were caching the result of pointer dereferences and demonstrated that the standard didn't technically allow that. Thus the errors are indications of where the optimizer could have gone wrong, whether it did in practice or not.
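The classic strict-aliasing trap, for anyone following along: reading a float's bits as `*(std::uint32_t *)&f` is undefined behavior, because the optimizer may assume an int load can't alias a float object and cache or reorder the accesses. std::memcpy is the sanctioned alternative, and mainstream compilers reduce it to a plain register move, so the defined version costs nothing:

```cpp
#include <cstdint>
#include <cstring>

// Bit-pattern of a float, without violating strict aliasing.
// The pointer-cast version (*(std::uint32_t *)&f) is UB; this memcpy
// compiles down to the same single move on x86 and ARM.
std::uint32_t float_bits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```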

Graf Zahl said:

If C managed to make C99 somewhat incompatible with C89, why can't C++ do the same?
C++11 already did include some changes that required code adjustments - all I say is 'narrowing conversion'.

API breakage is actually easier to justify than ABI breakage. API breakage requires adjustments to your source files, ABI breakage requires recompilation which you may not be able to do.

Blzut3 said:

So if the next version of Visual Studio inserted code to flip the stack every function call you wouldn't complain about it making code that runs slower than 2015? I highly doubt it.


I wonder why it should do that. Even if it were necessary to add arguments out of order, that wouldn't produce much of an overhead, because it'd just move the stack pointer in front of the arguments to be pushed first and then write those directly to the desired addresses - which in some cases it already does.

I'm sorry, but ABI compatibility is huge since it's inevitable that you're going to work with a older library (or how about just the Win32 API? What would users think if every single Windows application stopped working because the C standards people decided to define the evaluation order?) or your middleware is going to be compiled with an older compiler.


Utter nonsense.
First, there is no ABI compatibility between different C++ compilers. Each does its own thing - even different versions of Visual Studio tend to have problems with this.

How about that? Finally DEFINE an ABI for C++??? The current situation is a complete nightmare with each compiler doing its own thing. Visual C++ even uses a different calling convention here than for plain functions.

So whatever is to preserve here, certainly isn't worth it.

There is ABI compatibility for pure C functions, and that can be preserved. In fact this will have to be declared 'don't touch'. And the Windows API doesn't even use the standard C calling convention, so special handling is needed anyway.

I fail to see how removing pointer dereferences is only a micro optimization. Pointer dereferences almost always come with a cache miss.


Now, isn't that hypocritical? In the name of performance (and from experience a pretty negligible one in the vast majority of cases) the programmers are presented with some breaking changes, but for making the language robust, idiotic arguments about a theoretical performance impact are pulled out? For what platforms? Either old and obsolete ones, or new ones made by people who take 'undefined' in a bad standard literally. If those ever surface in the future, it's clear what made them possible. If the standard didn't have this undefinedness, nobody could afford to create non-compliant hardware in the first place!

API breakage is actually easier to justify than ABI breakage. API breakage requires adjustments to your source files, ABI breakage requires recompilation which you may not be able to do.



Which would be a valid point if C++ even had a stable and well-defined ABI to begin with. But since it doesn't, why even bring it up?

Graf Zahl said:

Which would be a valid point if C++ even had a stable and well-defined ABI to begin with. But since it doesn't, why even bring it up?

The platform defining the ABI rather than the language doesn't mean it doesn't exist. It just means it's out of the language's control. C++ having a defined ABI wouldn't bring as much benefit as you think it would.

Also Visual C++ seems to make a bigger mess of it than other platforms.

Gez said:

Name mangling is not defined by the C++ standard, is it?

Don't believe so. But on Linux and OS X compilers interoperate with each other all the time (Clang and GCC use the same mangling rules).

On Linux, the non-C++11-compliant COW std::string has been kept around in libstdc++ in order to not break ABI and to allow newer compilers to interoperate with code made by older compilers.

Quasar said:

...I could go on, those are just a few of the most obvious programmer-antagonistic instances of undefined- or implementation-defined C behaviors that I have run into. When you cannot even depend on basic arithmetic to have defined behavior, you have a language design issue from my POV.

It's even deeper than that. The programmer sometimes ends up implementing his/her own workaround which is probably slower than what the compiler can do, and easy to get wrong. Then you publish that library, and the next programmer down the line doesn't trust the library, and implements another round of workarounds. It's ugly, it's unnecessary, and there's really no defense for it, regardless of platform.

Blzut3 said:

(a lot of technically correct stuff (maybe), but not getting it)

You know, at the end of the day, people actually have to use your spec, and, in that scenario, yes, there is a compiler. You know, the real world?

Here's a thought: Saying that a particular operation is undefined...is actually a definition. How about actually defining a useful result? If the compiler is allowed to produce any result, let it produce a meaningful result. What's the harm there? (Other than the ability to hide behind an 'undefined' label, that is?)

Blzut3 said:

The platform defining the ABI rather than the language doesn't mean it doesn't exist. It just means its out of the language's control. C++ having a defined ABI wouldn't bring as much benefit as you think it would.


Actually, just saying that a method call acts like a regular function with the class pointer being the first argument would resolve all issues here.

And if the standard defined name mangling, we'd have perfect interoperability instead of the current mess, which gets propagated by an ignorant committee.

kb1 said:

Here's a thought: Saying that a particular operation is undefined...is actually a definition. How about actually defining a useful result? If the compiler is allowed to produce any result, let it produce a meaningful result. What's the harm there? (Other than the ability hide behind an 'undefined' label, that is?)


It's the age-old fear that defining a spec too strictly may produce inefficient code. And forgetting in the process that NOT defining such things only helps create code that is even less efficient, because it needs to work around the deficiencies in the spec.

But what are we talking about here? Worst case scenario: a minor increase in instructions that hardly contribute to the overall time spent in the code. Best case scenario: no impact whatsoever - which for signed shifts would be the outcome on any relevant platform.

Talk about shooting oneself in the foot...

Graf Zahl said:

Actually, just saying that a method call acts like a regular function with the class pointer being the first argument would resolve all issues here.

And if the standard defined name mangling, we'd have perfect interoperability instead of the current mess, which gets propagated by an ignorant committee.

It's the age-old fear that defining a spec too strictly may produce inefficient code. And forgetting in the process that NOT defining such things only helps create code that is even less efficient, because it needs to work around the deficiencies in the spec.

But what are we talking about here? Worst case scenario: a minor increase in instructions that hardly contribute to the overall time spent in the code. Best case scenario: no impact whatsoever - which for signed shifts would be the outcome on any relevant platform.

Talk about shooting oneself in the foot...

Yeah, sure hope my TRS-80 doesn't slow down when I recompile my 35-year-old heart monitor...no, wait, the Z-80 had all the proper logical and arithmetic shifts.

I guess it's ok to deprecate the 'unsafe' string routines, and replace them with slower versions, cause 'safety'.

Yay spec and committee!
Suck it down, programmers!

Graf Zahl said:

Actually, just saying that a method call acts like a regular function with the class pointer being the first argument would resolve all issues here.

And if the standard defined name mangling, we'd have perfect interoperability instead of the current mess, which gets propagated by an ignorant committee.

This is the point where I know you're deliberately ignoring evidence to the contrary. (The three sentences I wrote in this thread following the one you quoted.) Defining an ABI in the standard would not bring any of the benefit that you think it would, since compilers are able to interoperate right now. This doesn't show on Windows, where Visual C++ apparently keeps its ABI private, but elsewhere this issue doesn't appear to exist.

Given what I heard in IRC, I think it's time to step away from this thread. There doesn't seem to be anything more productive to say.


I think it's reasonable for the C/C++ standards committees to be concerned about backwards compatibility. After all, there are a lot of people only using C/C++ for precisely that reason. For people who don't have that problem, there are many other (and perhaps better) options available now (than C/C++), and have been for a very long time.

Blzut3 said:

This doesn't show on Windows where Visual C++ keeps their ABI private apparently, but elsewhere this issue doesn't appear to exist.


Yeah, right. The committee doesn't care because it's just Windows? Figures somehow...


Well, if the standard said that a class method has to use the same calling convention as a plain function, i.e. the compiler has to translate

myclass->method(blah, blah, blah)

to

method(myclass, blah, blah, blah)

then the problem would be solved. Note that this doesn't impose any specifics about the calling convention; all it says is that the compiler may not invent some special handling here (which MSVC clearly did, for the same bogus concept of efficiency that led to the undefined-ness of the bitshift operators).

And making the name mangling part of the standard would ensure that output from different compilers would be compatible.
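A sketch of what that flattening looks like in practice: the same extern "C" wrapper pattern used today to expose C++-only APIs (Steamworks among them) to C consumers. Everything here (the Greeter class, the function names) is illustrative, not a real API:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}
    std::string greet() const { return "hello, " + name_; }
private:
    std::string name_;
};

// The flat surface: method(myclass, args...) instead of
// myclass->method(args...). extern "C" pins down both the name
// mangling and the calling convention.
extern "C" {

Greeter *greeter_create(const char *name) { return new Greeter(name); }

// g->greet(), rewritten as greeter_greet(g, ...).
void greeter_greet(const Greeter *g, char *buf, std::size_t bufsize)
{
    std::snprintf(buf, bufsize, "%s", g->greet().c_str());
}

void greeter_destroy(Greeter *g) { delete g; }

} // extern "C"
```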


I am really disappointed about your attitude here which is typical committee-speak that tries to justify why things DON'T get done that really should be done.

Of course this won't solve the problems.

Jon said:

For people who don't have that problem, there are many other (and perhaps better) options available now (than C/C++), and have been for a very long time.

Such as?

vadrig4r said:

Such as?

Gez said:

D

And Rust. Or if you want something not quite as low-level as these, Go is pretty nice.

Graf Zahl said:

It's the age old fear that defining a spec too strictly may produce inefficient code. And forgetting in the process that NOT defining such things will only help create code that is even less efficient because it needs to work around the deficiencies in the spec.


I agree with this statement. You have, for example, HotSpot, which can do so many optimizations because the only thing it needs to worry about is whether the program still has the same side effects. And there is no worry (apart from JNI and JIT bugs) about programs breaking randomly depending on which optimizations are used.

Graf Zahl said:

myclass->method(blah, blah, blah)

to

method(myclass, blah, blah, blah)


This is how Java does it, you can think of instance methods being the same as static methods except with an instance of the class as the first argument.

chungy said:
vadrig4r said:

Such as?

And Rust. Or if you want something not quite as low-level as these, Go is pretty nice.


And to be honest, C# and Java too.


This thread in a nutshell is why I hate C++. There's no consensus about what the language itself should be anymore, so the standards committee tries to take some agnostic approach to please everybody and misses the point of software development completely. Is the goal to enable programmers to do their jobs efficiently or is it just to ensure that nobody ever stops using C++?

At least C knew what it wanted to be, even if certain parts of it were pointlessly dangerous (strcpy() and functions like it that copy unspecified amounts of data were obviously never a good idea).

And to be honest, C# and Java too.

Java sucks phenomenally for most game development (mostly in performance), but in all other ways it's a pretty good language with some nice features to boot.

I've never felt encouraged to explore C# for its heavy dependence on the .NET interface. Boo Microsoft.

Honestly, as a game developer the only thing that's keeping me away from using D is the fact that Steamworks (and some other platform specific APIs) are C++ only. That didn't stop us from making a wrapper library for C, though...

sheridan said:

Java sucks phenomenally for most game development (mostly in performance), but in all other ways it's a pretty good language with some nice features to boot.

It's apparently a terrible language for data manipulation, if we judge by the hoops Maes had to jump through in order to get Mocha Doom to do stuff like read WAD records and make the renderer work to generate an image that can then be displayed.

Gez said:

It's apparently a terrible language for data manipulation,


Not just apparently. It actually is.

Java does not have any usable value semantics for structs, which necessitates either excessive micro-allocations or refactoring to use more heap-friendly methods (like using arrays), which of course completely negates the point of a high-level language.

If that wasn't the case the language wouldn't really be that bad (that is, of course, if it could be uncoupled from its inane runtime environment...)


For me the sweet spot would be C++ with all that undefined nonsense and the C-induced cruft (like having to #include headers) removed.

sheridan said:

Java sucks phenomenally for most game development (mostly in performance), but in all other ways it's a pretty good language with some nice features to boot.


I'd be inclined to disagree, citing Minecraft and a bunch of Android games; but Minecraft is not famous for high performance, and I haven't written any games in Java, so I'm not really speaking from a position of experience.

I've never felt encouraged to explore C# for its heavy dependence on the .NET interface. Boo Microsoft.


I'd suggest looking at Mono here, but Xamarin got bought by MS anyway...

I played through Terraria on Linux a few years back, and that was possible because it was written in C# (on Microsoft's XNA framework) and a third party managed to stub enough of it that it could run largely unmodified on Linux, despite never being intended or tested for it. I was pretty amazed.

