
I have the following scenario:

import numpy as np

value_range = [250.0, 350.0]
precision = 0.01
unique_values = len(np.arange(min(value_range),
                              max(value_range) + precision,
                              precision))

This means all values lie between 250.0 and 350.0 with a precision of 0.01, giving a potential total of 10001 unique values that the data set can take.

# This is the data I'd like to scale
values_to_scale = np.arange(min(value_range), 
                            max(value_range) + precision, 
                            precision) 

# These are the bins I want to assign to
unique_bins = np.arange(1, unique_values + 1)

You can see in the above example that each value in values_to_scale maps exactly to its corresponding item in the unique_bins array, i.e. a value of 250.0 (values_to_scale[0]) corresponds to a bin of 1 (unique_bins[0]), and so on.

However, if my values_to_scale array looks like:

values_to_scale = np.array((250.66, 342.02)) 

How can I do the scaling/transformation to get the unique bin value? I.e. 250.66 should map to a bin value of 66, but how do I obtain this?

NOTE: The value_range could equally be between -1 and 1; I'm just looking for a generic way to scale/normalise data between two values.

  • Have a look at numpy.linspace (Commented Jun 19, 2019 at 12:34)
  • That's not what I asked. That gives a list between two values but does not transform/map onto a range. (Commented Jun 19, 2019 at 12:51)

1 Answer


You're basically looking for a linear interpolation between min and max:

minv = min(value_range)
maxv = max(value_range)
unique_values = int(((maxv - minv) / precision) + 1)

# Shift to zero, rescale onto the bin range, then truncate to integer bins
((values_to_scale - minv) / (maxv + precision - minv) * unique_values).astype(int)
# array([  65, 9202])
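
As a side note (not part of the original answer): the exact result for 250.66 is 66, but floating-point error leaves the intermediate value fractionally below the bin boundary, which is why the truncating astype(int) cast above gives 65. If the nearest bin is wanted rather than the floor, one small variation is to round before casting (variables restated here for completeness):

import numpy as np

value_range = [250.0, 350.0]
precision = 0.01
minv, maxv = min(value_range), max(value_range)
unique_values = int(((maxv - minv) / precision) + 1)
values_to_scale = np.array((250.66, 342.02))

# Round to the nearest bin instead of truncating towards zero
np.round((values_to_scale - minv) / (maxv + precision - minv) * unique_values).astype(int)
# expected: array([  66, 9202])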

1 Comment

Thank you - this is similar to what I came up with in the end and answers this question perfectly. As with my other question, however, when running this for an array like arr = np.random.randint(0, 12000, size=(40000, 30000), dtype=np.uint16) (which is roughly a 2 GB array) you get a HUGE memory spike when performing the calculation - on my machine it needs more than 20 GB of RAM to complete. I'm trying to reduce that.
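
A possible way to tame that spike (a sketch, not from the thread; the function name, block size and output dtype are assumptions) is to apply the same linear scaling block by block, so that only a small floating-point temporary exists at any one time:

import numpy as np

def scale_to_bins_blocked(arr, value_range, precision, block_rows=1000):
    # Same linear scaling as the answer above, applied in row blocks to limit peak memory
    minv, maxv = min(value_range), max(value_range)
    unique_values = int(((maxv - minv) / precision) + 1)
    scale = unique_values / (maxv + precision - minv)

    out = np.empty(arr.shape, dtype=np.int32)  # holds the bin indices
    for start in range(0, arr.shape[0], block_rows):
        stop = start + block_rows
        block = arr[start:stop].astype(np.float64)  # only this block is promoted to float
        block -= minv
        block *= scale
        out[start:stop] = block  # truncates towards zero on assignment, like astype(int)
    return out

Smaller block_rows lowers the peak memory further at the cost of a bit more Python-level looping.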
