Monday, September 1, 2014

Java annoyances

So I've wanted to learn java for years and am kinda forcing myself to do the desktop portion of the Williams IC Tester in java.

Quick take: C# is a better language in pretty much every way (which makes sense since MSFT shamelessly copied java and improved upon it).  And in some ways, C++ is better than java, which is why I am writing this blog post.

I discovered (re-discovered) that java does not have unsigned data types.  Everything is signed.  If you google for information about this, you will find java apologists demanding that people justify why unsigned data types are needed.  I found this attitude somewhat absurd.

Being an emulation author and a machine language enthusiast, unsigned data types are a no-brainer.  The benefit of unsigned types is that the maximum size of the type in question is doubled from the signed version (in the positive direction).  This benefit is something I take advantage of so often that I take it for granted.  Here is a quick example.

An unsigned byte has a range of 0-255, while a signed byte is -128 to 127.  I work with bytes a lot, and by default I expect to have a max upper range of 255.  I've expected this for years (many, many years).  In fact, I can't remember the last time I willingly chose to use a signed byte for any meaningful work.  If I am going to use a signed number for any purpose, I will almost always use whatever the native CPU is optimized for, which means on a 32-bit CPU I would use a 32-bit signed integer, on a 64-bit CPU a 64-bit signed integer, etc.  I can't think of any practical use for a signed byte when it's so common/easy to use an 'int'.  So the fact that java only supports signed bytes is a major blunder in the design, IMO.
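To make the wrap-around concrete, here's a quick sketch (class and variable names are mine) showing what happens when you try to store a value from the "upper half" of a byte's range in java:

```java
public class ByteRange {
    public static void main(String[] args) {
        // java bytes are always signed, so 128-255 wrap negative.
        byte b = (byte) 200;           // bit pattern 0xC8
        System.out.println(b);         // prints -56, not 200
        System.out.println(Byte.MIN_VALUE + ".." + Byte.MAX_VALUE); // -128..127
    }
}
```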

Now, you may be saying that I can just ignore the sign for many operations and I will still get the same result.  This is somewhat true, or rather, would be somewhat true, if java didn't get in my way.

In C++, I am used to doing something like this:

uint8_t u8 = 0xFF; // C++ assigns this to have a value of 255
int i = u8;

^ - I would _always_ expect i to have a value of 0xFF after performing this operation.  It is a no-brainer.

However, to my dismay, I discovered that java does the wrong thing here:

byte u8 = (byte) 0xFF; // the cast is required to even compile; java then treats this as -1, even though the type is called 'byte'
int i = u8; // now i is also -1 (0xFFFFFFFF) instead of 255 (0xFF), thanks to sign extension.

To work around this poor language design, one has to do this:

byte u8 = (byte) 0xFF;
int i = u8 & 0xFF; // mask off the sign extension; now i is 255

Now it works correctly because I manually hacked the value back to the proper range.  Never mind that performing an extra AND for no practical reason wastes CPU cycles (admittedly not that many) that wouldn't have to be wasted if I could simply tell the language that u8 was supposed to be unsigned and have it treated accordingly.

What I've gotten from reading about java is that if you want to deal with unsigned types, you need to use a bigger signed type to spoof the unsigned one.  So to represent an unsigned byte you'd use a short; for an unsigned short, an int; and for an unsigned int, you'd have to go crazy and use a long.  It's really pathetic for any "serious" language to require this kind of hackery.
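Here's a sketch of what that widen-and-mask dance looks like for each size (names are mine; the mask constants are the standard idiom).  For what it's worth, Java 8, which shipped earlier in 2014, added helpers like Byte.toUnsignedInt and Integer.toUnsignedLong that do the same masking for you:

```java
public class UnsignedSpoof {
    public static void main(String[] args) {
        byte  ub = (byte)  0xFF;   // "unsigned byte"  255, stored as -1
        short us = (short) 0xFFFF; // "unsigned short" 65535, stored as -1
        int   ui = 0xFFFFFFFF;     // "unsigned int"   4294967295, stored as -1

        int  b = ub & 0xFF;        // widen byte -> int, mask off sign extension
        int  s = us & 0xFFFF;      // widen short -> int
        long i = ui & 0xFFFFFFFFL; // widen int -> long (note the L suffix on the mask)

        System.out.println(b + " " + s + " " + i); // 255 65535 4294967295

        // Java 8 equivalents of the masks above:
        System.out.println(Byte.toUnsignedInt(ub));     // 255
        System.out.println(Integer.toUnsignedLong(ui)); // 4294967295
    }
}
```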

This doesn't mean that I've given up on java.  After all, C# is proprietary/MSFT so it's not really available on other platforms (don't get me started on mono).  So java still has its place, but I am not going to be fleeing C++ any time soon since java hasn't given me a great reason to do so.
