Same Stupidity, Different Datatype

I had an annoying surprise recently.  I discovered that gcc has decided that long longs are 64 bits- on both 32- and 64-bit platforms.  I discovered this when I wanted to do a double-word multiply.  On 32-bit gcc this is easy- to multiply two 32-bit unsigned longs and get the 64-bit result, you simply cast one of the two to be long long and multiply away.  This is often incredibly usefull.  And I had thought that it was portable to all gcc platforms- only now I find out it isn’t.  This is especially annoying as, at least on 32 bit platforms, multiplying a long long by a long produces optimal code.  The x86 has a single, fast, instruction to multiply two 32-bit quantities to produce a 64-bit quantity.  In fact, this is the standard multiply instruction- the generated code just ignores the upper 32 bits when only a long result is desired.  I now have no easy way to access the capability. 

But this unpleasant discovery has gotten me thinking about integers and their role in programming- and what constraints a new programming language should hold to.

I shouldn’t have been surprised.  The very existance of long long demonstrates a fundamental problem.  What role does long long play?  By this I mean, what uses can and should long long be put to?  The use I was putting long long to was as a double word, allowing me to multiply two large longs without losing information.  But an equally valid (or invalid)

There are two.  One is double-word arithmetic.  Which comes at a cost, I might add.  While “simple” operations- like addition, subtraction, shifting, xor, etc.- and word by double word multiplication are easy to implement, several other operations are not.  For example, while dividing a double word by a word where the result fits in a single word can be done in a single instruction on the x86, generic double word divided by double word, or double word divided by single word when the result doesn’t fit into a single word can not be.  Nor can double word multiplied by another double word- even if you’re only interested in the “low” double word result, and don’t need the full quadword.  In this use, if longs are 64 bits, you want long longs to be 128 bits.

The other use is to allow access to “large” (aka 64-bit) address spaces.  Not just pointers and indexs and data sizes, but also large file system address spaces (aka files) via llseek.  From this perspective, having long longs be exactly 64 bits across all platforms is usefull.  It’ll be decades yet before even hard disks on the largest servers become large enough to start pushing 64-bit address spaces.  Remember that 64 bits puts us solidly into exabytes (millions of terabytes).  On 64-bit systems, llseek should become the same as lseek.

 But thinking about llseek and lseek makes something clear- we’ve been down this path before.  With lseek.  Originally, the file operations had were open, close, read, write, and seek.  Originally, seek took an integer- which was assumed to be 16 bit.  Because there was no long- there were ints (16 bit) and chars (8 bit).  When Unix was ported to 32 bits, they kludged the system.  They introduced a new integer type, long (well, and short too at that point), and changed seek to lseek- long seek.  In other words, in going from 16 to 32 bits, Unix did what I (rightly, I still maintain) vilify the C99 spec committee for doing.

But switching from 32 to 64 bits will cause problems anyways.  As an example, take Java- which is going to have a huge 32-bit problem, due to their use of integers for array indexes.  And Java takes the step of explicitly defining ints to be 32 bit 2′s complement signed integers.  C# probably has the same problem, and both of them inherited it from C.  How many C programs use integers as index subscripts?  If I recall correctly even the vaunted K&R uses integers as array indexes.  When we go to 64 bits, all of this stuff breaks.

Ocaml has this problem as well.  A coworker wrote a bit of code to change an arrays size.  I was going to post a code snippet, but wordpress keeps “fixing” my HTML formatting.  I hate programs which keep assuming that they’re smarter than I am, because they are invariably wrong.  Anyways, the core code snippet was this: new_array.(i) <- old_array.((i * old_array_length) / new_array_length).  Both indexes are, it’s easy for even a compiler to prove, small enough to fit into a single word- but the intermediate result (i * old_array_length) is not, given long enough arrays.  Once array lengths start getting longer than the square root of the word size, this intermediate result will overflow.  Nor will reordering the operations help- since i is always smaller than new_array_length (it’s an index into this array, after all), computing i/new_array_length will always result in 0.  The multiplication must come first.  His solution was to use Ocaml’s 64-bit type (Int64) to hold this intermediate result.  Which fixes the problem on 32-bit platforms, but not on 64-bit platforms when arrays start getting larger than 4G elements long.  Ocaml has fewer integer types, and thus runs into this problem less often- but it still has too many integer types, and still runs into this problem much too often.

All of this leads to my conclusion that the main point of having more than one type of integer in code intended to be portable is so that the programmer can pick the wrong one.  For code actually banging on hardware- OSs, device drivers, etc.- you need specific sized words.  This register is 16 btis, that one is 32, etc.  Such code will pay the price for portability in this case- but the portability problems this causes will be drowned out by much larger portability problems.  As a general rule, hardware-specific code has strictly limited portability by it’s very nature. For general purpose, “user level code”, what use is there for multiple different integer types?  Holding an integer in a single word is an optimization, one that should be applied automatically by the compiler.  How large should an integer be?  As large as necessary.   Or, more precisely, as small as possible.

 The first interesting result of this is that there is a substantial difference now between compile time static type checking and run time dynamic type checking.  Or rather, this difference is more pronounced.  Even if the code knows it’s an integer at a given point, it can still be one of at least two different types (single word and multiword), and probably one of multiple different types (half word, word, double word, multi-word).  Which means an integer operation in a dynamic run-time type checking system requires at least one, if not several, branches to determine which type of integer operation is needed- from the CPU’s perspective, this is a drastic difference between multiplying two single word integers and multiplying two multi-word integers.  A static type system will, in most cases, be able to prove that the integer is of a given type, and apply the appropriate optimizations.  Of course, this requires an advanced type system (more advanced than Ocaml’s- which isn’t hard, Ocaml is state of the art, for 20 years ago.  It just shines in comparison to languages still stuck in the 70′s).

The one downside to this idea is that I’m not sure dependent types (which is the general case of the type system knowing how big an integer is, or at least could be) can be implemented without type annotations- in other words, it might be a choice of type inference vr.s integer optimizations.  If this is the case, then I would argue that integer optimizations should win.  Type inference is a convience- performance and correctness are not.  Actually, allow me to edit this a bit.  Full dependent types not only gives you the size of integers, it also gives you the size of data structures as well.  This allows you to do neat tricks like eliminating bounds checking.  This full level of information may require type annotations- I’m not sure the lesser level of information you need simply to size integers requires type annotations to be supplied by the programmer.  You can always start with the assumption that any count of memory elements fits into a single word.  And, of course, you know the sizes of constant data structures and constant values.  These three starting points may be sufficient to allow optimization in most cases without programmer supplied type annotations.  I’ll have to play with this a bit to determine that.

The other interesting issue this raises is that the format of an integer is no longer known.  Which demonstrates that integers are used in two different roles.  One role is that of an integer, with the standard mathematic operators of add, subtract, negation, multiply, divide (truncate to zero) and modulo.  The stuff that does not depend upon the format or structure of the integer.  The other role is that of an array of bits, where the standard operators are shift, complement (bitwise negation), bitwise and, bitwise or, bitwise xor, get or set or clear a given bit, and maybe bit blit (copying sequences of bits).  Using integers as bit arrays requires you to know the format and structure of an integer- is it two’s complement, one’s complement, or sign magnitude?  Signed or unsigned?  How many bits does it hold?  Does shifting right do sign extension or not?  I really think that these two datatypes should be different- that we shouldn’t use the same type for both.  This frees up integers to use whatever representation is most efficient, makes the bit array operations easier to define precisely and portably,  makes the type system’s job easier, and eliminates a cause of portability problems (also known as bugs).  The only code I know of that mixes both roles, and which isn’t some seriously kludgy attempted optimization, is cryptography.  Cryptographic routines, however, are probably best implemented in a CPU-optimized, and therefor CPU-specific, manner.

Another comment I’d like to make is that the above logic cannot be used in regards to floating point numbers.  Finite integers, by definition, can be represented with a finite number of bits.  And the largest numbers you’ll generally see in most programs are of the order of 8K bits, or 1K bytes.  And the vast majority (95%+) will easily fit into a single machine word, two at the most.  It is not at all unusual, however, for irrational numbers to show up in floating point programs- all you need is to ask what the euclidean length (norm-2) of the vector [1; 0; 1] is (hint: square root of 2).  Pi and e are also irrationals that show up all the time.  And thus, some sort of rounding and approximation needs to be used simply to keep the amount of memory needed by each number finite.  You can not simply declare that a floating point number uses as many bits as needed, because no amount of bits is sufficient.

 

No related posts.

This entry was posted in Classic, To Be Categorized and tagged , . Bookmark the permalink. Post a comment or leave a trackback: Trackback URL.
  • Pingback: The Cheap Sitcom Clip Scene Blog Post | Enfranchised Mind

  • http://chris.iluo.net/ Chris

    If you don’t already, why don’t you use typedefs?

    typedef long int32_t;

    #ifdef long long is 64 bits
    typedef long long int64_t;
    #else
    typedef long long long int64_t; // or whatever
    #endif

    uint32_t a = 0xffffffff;
    uint32_t b = 0xffffffff;
    int64_t result = MultiplyLargeNumbers(a, b);

  • Categories