array_of_hash1 = 
    [{"Date" => "2019-07-01", "Country" => "US", "Email" => "[email protected]", "Price" => "11.224323", "Tax" => "8.55443"},
     {"Date" => "2019-07-01", "Country" => "US", "Email" => "[email protected]", "Price" => "16.664323", "Tax" => "6.55443"},
     {"Date" => "2019-06-30", "Country" => "US", "Email" => "[email protected]", "Price" => "17.854323", "Tax" => "7.12343"},
     {"Date" => "2019-07-02", "Country" => "UK", "Email" => "[email protected]", "Price" => "14.224323", "Tax" => "4.32443"}]

array_of_hash2 = 
    [{"Date" => "2019-07-01", "Name" => "John", "Price" => "11.3442223", "Tax" => "3.44343"},
     {"Date" => "2019-07-01", "Name" => "Jack", "Price" => "14.332323", "Tax" => "5.41143"},
    {"Date" => "2019-07-02", "Name" => "Sam", "Price" => "10.2223443", "Tax" => "2.344552"}]

The above are my inputs, given as arrays of hashes. I need to:

  1. Sum Price and Tax in array_of_hash1, grouped by Date
  2. Sum Price and Tax in array_of_hash2, grouped by Date
  3. Subtract (2) from (1) for each Date
  4. If a Date in array_of_hash1 has no matching entries in array_of_hash2, take the values from array_of_hash1 unchanged.

Here is my expected output:

[{"Date" => "2019-07-01", "Country" => "US", "Email" => "[email protected]", "Price" => "2.2121007000000006", "Tax" => "6.254"}, 
 {"Date" => "2019-07-02", "Country" => "UK", "Email" => "[email protected]", "Price" => "4.0019787", "Tax" => "1.9798780000000002"},
{"Date" => "2019-06-30", "Country" => "US", "Email" => "[email protected]", "Price" => "17.854323", "Tax" => "7.12343"}]

2 Answers

This uses two helper methods to keep the code DRY, built on Enumerable#sum, Enumerable#group_by, Hash#merge and Hash#transform_values, all of which are covered in the Ruby documentation.

I'm also using Object#then here.

def sum_price_tax(ary)
  ary.first.merge ary.then { |ary| { "Price" => ary.sum { |h| h["Price"].to_f }, "Tax" => ary.sum { |h| h["Tax"].to_f} }  }
end

def group_and_sum(array_of_hash)
  array_of_hash.group_by { |h| h["Date"] }.transform_values { |ary| sum_price_tax(ary) }
end
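
To see what the helpers return, here is a minimal sketch on a two-row sample. The `sample` rows are made up for illustration, and the helpers are repeated (with `sum_price_tax` slightly simplified, without `then`) so the snippet runs on its own. Note that `ary.first.merge` keeps the non-numeric fields of the first row in each group.

```ruby
# Helpers from the answer, repeated so this snippet is self-contained.
def sum_price_tax(ary)
  ary.first.merge(
    "Price" => ary.sum { |h| h["Price"].to_f },
    "Tax"   => ary.sum { |h| h["Tax"].to_f }
  )
end

def group_and_sum(array_of_hash)
  array_of_hash.group_by { |h| h["Date"] }.transform_values { |ary| sum_price_tax(ary) }
end

# Hypothetical sample rows, chosen so the float sums are exact.
sample = [
  { "Date" => "2019-07-01", "Country" => "US", "Price" => "1.5", "Tax" => "0.5" },
  { "Date" => "2019-07-01", "Country" => "UK", "Price" => "2.5", "Tax" => "0.25" }
]

grouped = group_and_sum(sample)
# One entry keyed by "2019-07-01": Country "US" (taken from the first row),
# Price 4.0 and Tax 0.75 as Floats.
p grouped
```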

After the methods are defined, you can do:

a1 = group_and_sum(array_of_hash1)
a2 = group_and_sum(array_of_hash2)
a1.map { |k, v| v.merge(a2[k] || {}) { |_key, old_val, new_val| old_val.is_a?(Float) ? old_val - new_val : old_val  } }

#=> [{"Date"=>"2019-07-01", "Country"=>"US", "Email"=>"[email protected]", "Price"=>2.2121007000000006, "Tax"=>6.254, "Name"=>"John"}, {"Date"=>"2019-06-30", "Country"=>"US", "Email"=>"[email protected]", "Price"=>17.854323, "Tax"=>7.12343}, {"Date"=>"2019-07-02", "Country"=>"UK", "Email"=>"[email protected]", "Price"=>4.0019787, "Tax"=>1.9798780000000002, "Name"=>"Sam"}]

Note that "Name" from array_of_hash2 is also present in the result.

One way to get rid of "Name" is to use Object#tap and Hash#delete:

a1.map { |k, v| v.merge(a2[k] || {}) { |_key, old_val, new_val| old_val.is_a?(Float) ? old_val - new_val : old_val  }.tap { |h| h.delete("Name") } }

#=> [{"Date"=>"2019-07-01", "Country"=>"US", "Email"=>"[email protected]", "Price"=>2.2121007000000006, "Tax"=>6.254}, {"Date"=>"2019-06-30", "Country"=>"US", "Email"=>"[email protected]", "Price"=>17.854323, "Tax"=>7.12343}, {"Date"=>"2019-07-02", "Country"=>"UK", "Email"=>"[email protected]", "Price"=>4.0019787, "Tax"=>1.9798780000000002}]

Comments

I need the final result to have only the Date, Price and Tax columns. The other columns are not needed.
I see that some Date values are nil in the final result.
I used the following to keep only the Date, Price and Tax columns and to fix the nil values in Date. Is there a better way to do this? a1.map { |k,v| v.merge(a2[k] || {}) { |h, old_val, new_val| if old_val.is_a?(Float) then old_val - new_val elsif old_val == new_val then old_val end } }.map do |hash| hash.select{|k,v| ("Date" "Price" "Tax").include? k} end
If the inputs are CSV strings instead of arrays of hashes, would it be better and faster to do the above merge in Ruby on Rails?
Let me know your suggestions on the above question.
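
On keeping only Date, Price and Tax: one possible approach, sketched here under the assumption of Ruby 2.5 or later, is Hash#slice. Note that `("Date" "Price" "Tax")` in the comment above is a single concatenated string, so `include?` there is a substring check rather than a membership test. The `merged` rows below are abbreviated from the answer's output, and the names `merged`, `wanted` and `trimmed` are illustrative only.

```ruby
# Rows shaped like the merged result shown in the answer (Email omitted).
merged = [
  { "Date" => "2019-07-01", "Country" => "US", "Price" => 2.2121007000000006, "Tax" => 6.254, "Name" => "John" },
  { "Date" => "2019-06-30", "Country" => "US", "Price" => 17.854323, "Tax" => 7.12343 }
]

wanted = %w[Date Price Tax]

# Hash#slice returns a new hash containing only the listed keys.
trimmed = merged.map { |row| row.slice(*wanted) }
p trimmed

# Adjacent string literals are joined by the parser into "DatePriceTax".
p(("Date" "Price" "Tax"))
```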

We are given the following (simplified from the arrays given in the question and with one hash added to arr2):

arr1 = [ 
  {"Date"=>"2019-07-01", "Country"=>"US", "Price"=>"11.22", "Tax"=>"8.55"},
  {"Date"=>"2019-07-01", "Country"=>"US", "Price"=>"16.66", "Tax"=>"6.55"},
  {"Date"=>"2019-06-30", "Country"=>"US", "Price"=>"17.85", "Tax"=>"7.12"},
  {"Date"=>"2019-07-02", "Country"=>"UK", "Price"=>"14.22", "Tax"=>"4.32"}
]

arr2 = [
  {"Date"=>"2019-07-01", "Price"=>"11.34", "Tax"=>"3.44"},
  {"Date"=>"2019-07-01", "Price"=>"14.33", "Tax"=>"5.41"},
  {"Date"=>"2019-07-02", "Price"=>"10.22", "Tax"=>"2.34"},
  {"Date"=>"2019-07-03", "Price"=>"14.67", "Tax"=>"3.14"}
]

We will need the list of distinct "Date" values that appear in the hashes of arr1.

dates1 = arr1.map { |g| g["Date"] }.uniq
  #=> ["2019-07-01", "2019-06-30", "2019-07-02"] 

Now build a2 from those elements h of arr2 for which h["Date"] is in dates1, keeping only the keys "Date", "Price" and "Tax", and replacing the values of "Price" and "Tax" with string representations of their negated values:

# Interpolation avoids mutating the "-" literal, which fails if string literals are frozen.
a2 = arr2.select { |g| dates1.include?(g["Date"]) }.map do |g|
  { "Date"=>g["Date"], "Price"=>"-#{g["Price"]}", "Tax"=>"-#{g["Tax"]}" }
end
  #=> [{"Date"=>"2019-07-01", "Price"=>"-11.34", "Tax"=>"-3.44"},
  #    {"Date"=>"2019-07-01", "Price"=>"-14.33", "Tax"=>"-5.41"},
  #    {"Date"=>"2019-07-02", "Price"=>"-10.22", "Tax"=>"-2.34"}] 

We now loop over all elements of arr1 and a2 to create a hash whose keys are the values of "Date", with the values of "Price" and "Tax" aggregated. Once that is done, we extract the values of the hash that has been constructed.

(arr1 + a2).each_with_object({}) do |g,h|
  h.update(g["Date"]=>g) do |_,merged_hash,hash_to_merge|
    merged_hash.merge(hash_to_merge) do |k,merged_str,str_to_merge| 
      ["Price", "Tax"].include?(k) ? "%.2f" %
        (merged_str.to_f + str_to_merge.to_f) : merged_str
    end
  end
end.values
  #=> [{"Date"=>"2019-07-01", "Country"=>"US", "Price"=>"2.21",  "Tax"=>"6.25"},
  #    {"Date"=>"2019-06-30", "Country"=>"US", "Price"=>"17.85", "Tax"=>"7.12"},
  #    {"Date"=>"2019-07-02", "Country"=>"UK", "Price"=>"4.00",  "Tax"=>"1.98"}] 

In this last step the receiver of values is found to be the hash:

{"2019-07-01"=>{"Date"=>"2019-07-01", "Country"=>"US",
                "Price"=>"2.21", "Tax"=>"6.25"},
 "2019-06-30"=>{"Date"=>"2019-06-30", "Country"=>"US",
                "Price"=>"17.85", "Tax"=>"7.12"},
 "2019-07-02"=>{"Date"=>"2019-07-02", "Country"=>"UK",
                "Price"=>"4.00", "Tax"=>"1.98"}}

Notice that the result would be the same if arr1[1]["Country"] were "Canada". I've assumed that would not be a problem or could not occur.
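
That claim can be checked with a small sketch. The `aggregate` method below is a hypothetical wrapper around the steps above (with the negated strings built by interpolation), and `changed` is a copy of arr1 with one Country altered.

```ruby
# Hypothetical wrapper around the steps above, so they can be run twice.
def aggregate(arr1, arr2)
  dates1 = arr1.map { |g| g["Date"] }.uniq
  # Keep only arr2 rows whose date occurs in arr1, with negated amounts.
  a2 = arr2.select { |g| dates1.include?(g["Date"]) }.map do |g|
    { "Date" => g["Date"], "Price" => "-#{g["Price"]}", "Tax" => "-#{g["Tax"]}" }
  end
  (arr1 + a2).each_with_object({}) do |g, h|
    h.update(g["Date"] => g) do |_, merged_hash, hash_to_merge|
      merged_hash.merge(hash_to_merge) do |k, merged_str, str_to_merge|
        if %w[Price Tax].include?(k)
          format("%.2f", merged_str.to_f + str_to_merge.to_f)
        else
          merged_str # keep the value already stored (e.g. the first Country)
        end
      end
    end
  end.values
end

arr1 = [
  { "Date" => "2019-07-01", "Country" => "US", "Price" => "11.22", "Tax" => "8.55" },
  { "Date" => "2019-07-01", "Country" => "US", "Price" => "16.66", "Tax" => "6.55" },
  { "Date" => "2019-06-30", "Country" => "US", "Price" => "17.85", "Tax" => "7.12" },
  { "Date" => "2019-07-02", "Country" => "UK", "Price" => "14.22", "Tax" => "4.32" }
]

arr2 = [
  { "Date" => "2019-07-01", "Price" => "11.34", "Tax" => "3.44" },
  { "Date" => "2019-07-01", "Price" => "14.33", "Tax" => "5.41" },
  { "Date" => "2019-07-02", "Price" => "10.22", "Tax" => "2.34" },
  { "Date" => "2019-07-03", "Price" => "14.67", "Tax" => "3.14" }
]

# Copy each row so the original arr1 is left untouched.
changed = arr1.map(&:dup)
changed[1]["Country"] = "Canada"

original_result = aggregate(arr1, arr2)
changed_result  = aggregate(changed, arr2)
puts original_result == changed_result
```

The two results compare equal because, for every key other than "Price" and "Tax", the inner block returns the value already stored for that date.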

The final aggregation step uses the forms of the methods Hash#update (a.k.a. merge!) and Hash#merge that employ a block to determine the values of keys that are present in both hashes being merged.

The values of the block variables (|_,merged_hash,hash_to_merge| and |k,merged_str,str_to_merge|) are explained in the docs. The first block variable is the common key (_ and k). I've represented the first of these with an underscore to signal to the reader that it is not used in the block calculation (a common convention). The second block variable is the value of the key in the hash being built (merged_hash and merged_str). The third block variable is the value of the key in the hash being merged (hash_to_merge and str_to_merge).
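
As a rough illustration of those three block variables, the sketch below records each call the block receives during a single Hash#merge. The hashes `built` and `incoming` and their values are made up for this example; the block is only invoked for keys present in both hashes.

```ruby
# Hypothetical hashes: "Country" exists only in the first one.
built    = { "Date" => "2019-07-01", "Country" => "US", "Price" => "11.50" }
incoming = { "Date" => "2019-07-01", "Price" => "-1.25" }

seen = []
result = built.merge(incoming) do |key, built_val, incoming_val|
  # Record (common key, value in receiver, value in argument).
  seen << [key, built_val, incoming_val]
  key == "Price" ? format("%.2f", built_val.to_f + incoming_val.to_f) : built_val
end

p seen    # two calls: one for "Date", one for "Price"
p result  # "Country" is copied over without the block being called
```

Here `result["Price"]` is the string "10.25", and "Country" never reaches the block because it is not a shared key.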
