6

Currently I am implementing an equation (2^A)[X + Y*(2^B)] in one of my applications.

The issue is with the overflow of 32 bit value and I cannot use 64 bit data type.

Suppose when B = 3 and Y = 805306367, it overflows 32bit value, but when X = -2147483648, the result comes backs to 32 bit range.
So I want to store the result of (Y*2^B). Can anyone suggest some solution for this.... A and B are having value from -15 to 15 and X,Y can have values from 2147483647..-2147483648.
Output can range from 0...4294967295.

2
  • 2
    If A = B = 15 and X = Y = 2147483647 then the result is far greater than 4294967295. Are there other restrictions that bound the output in the 0...4294967295 range? Commented Jun 12, 2011 at 12:56
  • In your equation, is ^ xor, or power? What do the square brackets represent? Commented Feb 3, 2017 at 0:14

4 Answers 4

6

If the number is too big for a 32 bit variable, then you either use more bits (either by storing in a bigger variable, or using multiple variables) or you give up precision and store it in a float. Since Y can be MAX_INT, by definition you can't multiply it by a number greater than 1 and still have it fit in a 32 bit int.

Sign up to request clarification or add additional context in comments.

Comments

1

I'd use loop, instead of multiplication, in this case. Something like this:

int newX = X;
int poweredB = ( 1 << B ); // 2^B
for( int i = 0; i < poweredB ; ++i )
{
    newX += Y; // or change X directly, if you will not need it later.
}
int result = ( 1 << A ) * newX;

But note : this will work only for some situations - only if you have the guarantee, that this result will not overflow. In your case, when Y is large positive and X is large negative number ("large" - argh, this is too subjective), this will definitely work. But if X is large positive and Y is large positive - there will be overflow again. And not only in this case, but with many others.

1 Comment

+1 for showing that no multiplication is needed, just shifts.
1

Based on the values for A and B in the assignment I suppose the expected solution would involve this:

  • the following are best done unsigned so store the signs for X and Y and operate on their absolute value

  • Store X and Y in two variables each, one holding the high 16 bits the other holding the low bits something like
    int hiY = Y & 0xFFFF0000;

    int loY = Y & 0x0000FFFF;

  • Shift the high parts right so that all the variables have the high bits 0

  • Y*(2^B) is actually a shift of Y to the left by B bits. It is equivalent to shifting the high and low parts by B bits and, since you've shifted the high part, both operations will fit inside their 32 bit integer

  • Process X similarly in the high and low parts

  • Keeping track of the signs of X and Y calculate X + Y*(2^B) for the high and low parts

  • Again shift both the high and low results by A bits

  • Join the high and low parts into the final result

Comments

1

If you can't use 64-bits because your local C does not support them rather than some other overriding reason, you might consider The GNU Multiple Precision Arithmetic Library at http://gmplib.org/

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.