7

I expected the running time of Array#shift and Array#unshift both to be Θ(n). Reason being that the machine needs to loop through each array member and assign it to the key left or right to it.

In the case of Array#unshift, assuming that only one value is passed in as an argument and that there are lots of array members, I assumed that the value assignment for array[0] is not having a significant impact on the running time. In other words, when the number of array members is high and the number of variables passed into Array#unshift is low, I expected Array#shift and Array#unshift to have the same running time.

When running benchmarks on Ruby 2.1.2, these assumptions do not hold true. Why?

Code:

require 'benchmark'

GC.disable    

number_of_elements = 25_600_000

a1 =[]
a2 = []
a3 = []
a4 = []
q1 = Queue.new
q2 = Queue.new

puts number_of_elements

number_of_elements.times do
  q1.enq(true)
  q2.enq(true)
  a1 << true
  a2 << true
  a3 << true
  a4 << true
end

number_of_operations = 1

Benchmark.bm do |bm|
  puts "Queue#enq('test')"
  bm.report do    
    number_of_operations.times { q1.enq('test') }
  end

  puts "Queue#deq"
  bm.report do    
    number_of_operations.times { q2.deq }
  end

  puts "Array#shift"
  bm.report do
    number_of_operations.times { a1.shift }
  end 

  puts "Array#unshift"
  bm.report do
    number_of_operations.times { a2.unshift('test') }    
  end 

  puts "Array#pop"
  bm.report do
    number_of_operations.times { a3.pop }
  end

  puts "Array#<<"
  bm.report do
    number_of_operations.times { a4 << 'test' }
  end      
end

Result:

25600000
       user     system      total        real
Queue#enq('test')
   0.000000   0.000000   0.000000 (  0.000006)
Queue#deq
   0.010000   0.020000   0.030000 (  0.029928)
Array#shift
   0.010000   0.020000   0.030000 (  0.032203)
Array#unshift
   0.080000   0.060000   0.140000 (  0.143272)
Array#pop
   0.000000   0.000000   0.000000 (  0.000004)
Array#<<
   0.000000   0.000000   0.000000 (  0.000007)
8
  • What happens if you crank up the number of operations? Commented Jul 3, 2014 at 18:13
  • #[email protected]: 0.016307, #[email protected]: 0.059878, #shift@64m: 0.098583, #shift@128m: 0.344900. #[email protected]: 0.059736, #[email protected]: 0.126382, #unshift@64m: 0.285351, #unshift@128m: 0.993967. By comparing only these ratios, it would seem that #shift runs at Θ(3.4n) and #unshift slightly exponential. Commented Jul 3, 2014 at 18:38
  • 1
    Have you studied the Ruby source code to understand how arrays are implemented and how shift/unshift work? There are all sorts of things that could be going on behind the scenes that will cloud your results. Adding an element to the beginning of an array does not necessarily allocate a single new entry for the array, it might allocate many or it might just move a array-starts-here pointer; similarly, shifting an element might just update a pointer to point to the new index-0 address. Commented Jul 3, 2014 at 18:54
  • I had a look but it's written in C and with zero C experience I struggle to understand it. Commented Jul 3, 2014 at 19:02
  • 1
    your question is pretty useless without mentioning which ruby version you use Commented Jul 3, 2014 at 20:47

1 Answer 1

3

In MRI Ruby 2.1.2, unshift does realloc the array and copy it entirely:

              static VALUE
rb_ary_unshift_m(int argc, VALUE *argv, VALUE ary)
{
    long len = RARRAY_LEN(ary);

    [...]

    ary_ensure_room_for_unshift(ary, argc);
    ary_memcpy(ary, 0, argc, argv);
    ARY_SET_LEN(ary, len + argc);
    return ary;
}

shift apparently does not always do something like that:

              static VALUE
rb_ary_shift_m(int argc, VALUE *argv, VALUE ary)
{
    VALUE result;
    long n;

    [...]

    rb_ary_modify_check(ary);
    result = ary_take_first_or_last(argc, argv, ary, ARY_TAKE_FIRST);
    n = RARRAY_LEN(result);
    if (ARY_SHARED_P(ary)) {
        if (ARY_SHARED_OCCUPIED(ARY_SHARED(ary))) {
            ary_mem_clear(ary, 0, n);
        }
        ARY_INCREASE_PTR(ary, n);
    }
    else {
        RARRAY_PTR_USE(ary, ptr, {
            MEMMOVE(ptr, ptr + n, VALUE, RARRAY_LEN(ary)-n);
        }); /* WB: no new reference */
    }
    ARY_INCREASE_LEN(ary, -n);

    return result;
}
Sign up to request clarification or add additional context in comments.

2 Comments

In some cases, the array has extra space at the front allocated for the purpose of unshift: github.com/ruby/ruby/commit/…
The above is as of 2.0, so this answer is a little misleading. It seems like for array sizes less than 64, this optimization doesn't happen, presumably because O(n) isn't so bad when n < 64

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.