Ruby - sort array of hashes values (string) based on array order

I have an array of hashes in the format shown below, and I am trying to sort the hashes by their :book value, following the order given in a separate array. That order is not alphabetical, and for my use case it cannot be alphabetical.

I need to sort based on the following array:

array = ['Matthew', 'Mark', 'Acts', '1John']

Note that I've seen several solutions that leverage Array#index (such as Sorting an Array of hashes based on an Array of sorted values) to perform a custom sort but that will not work with strings.

I've tried various combinations of sorting with Array#sort and Array#sort_by but they don't seem to accept a custom order. What am I missing? Thank you in advance for your help!

Array of Hashes

[{:book=>"Matthew",
  :chapter=>"4",
  :section=>"new_testament"},
 {:book=>"Matthew",
  :chapter=>"22",
  :section=>"new_testament"},
 {:book=>"Mark",
  :chapter=>"6",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"9",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"17",
  :section=>"new_testament"}]

Here's an example

arr = [{a: 1}, {a: 3}, {a: 2}] 

order = [2,1,3]  

arr.sort { |a,b| order.index(a[:a]) <=> order.index(b[:a]) }                                           
# => [{:a=>2}, {:a=>1}, {:a=>3}]  

In your case it would be

order = ['Matthew', 'Mark', 'Acts', '1John']
result = list_of_hashes.sort do |a,b|
  order.index(a[:book]) <=> order.index(b[:book])
end

There are two important concepts here:

  1. Using Array#index to find where in an array an element is found
  2. the 'spaceship operator' <=> which is how Array#sort works - see What is the Ruby <=> (spaceship) operator?
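
To make those two concepts concrete, here is a minimal sketch using the book names from the question (the variable names are only illustrative). It shows that Array#index works on strings and what <=> returns for the resulting positions.

```ruby
order = ['Matthew', 'Mark', 'Acts', '1John']

# Array#index compares elements with ==, so strings work just like numbers.
mark_pos = order.index('Mark')   # => 1
john_pos = order.index('1John')  # => 3

# <=> returns -1, 0 or 1; Array#sort uses that value to order each pair.
cmp_forward  = mark_pos <=> john_pos  # => -1 (Mark comes before 1John)
cmp_backward = john_pos <=> mark_pos  # => 1
cmp_same     = mark_pos <=> mark_pos  # => 0
```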

You can make it slightly faster by indexing the list of elements you want to order by:

order_with_index = order.each_with_index.with_object({}) do |(elem, idx), memo|
  memo[elem] = idx
end

then instead of order.index(<name>) use order_with_index[<name>]

As can be seen from the documentation, Array#index indeed does work for strings (it's even the provided example), so this would work:

books.sort_by { |b| array.index(b[:book]) }

But instead of repeatedly searching through array, you can just determine the order once and then look it up:

order = array.each.with_index.to_h
#=> { "Matthew" => 0, "Mark" => 1, "Acts" => 2, "1John" => 3 }
books.sort_by { |b| order[b[:book]] }
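
One caveat: Ruby's documentation says the result of sort_by is not guaranteed to be stable, so entries for the same book may come back in any order. If the order within a book matters, a possible extension (a sketch, assuming chapters should be ordered numerically, which the question does not state) is to sort by an array key:

```ruby
array = ['Matthew', 'Mark', 'Acts', '1John']

# Trimmed-down sample data; the :section key is omitted for brevity.
books = [
  { book: 'Acts',    chapter: '17' },
  { book: 'Matthew', chapter: '22' },
  { book: '1John',   chapter: '1'  },
  { book: 'Acts',    chapter: '9'  },
  { book: 'Matthew', chapter: '4'  }
]

order = array.each_with_index.to_h

# Arrays compare element by element, so the book position decides first
# and the chapter number (converted from its string form) breaks ties.
sorted = books.sort_by { |b| [order[b[:book]], b[:chapter].to_i] }

labels = sorted.map { |b| "#{b[:book]} #{b[:chapter]}" }
# => ["Matthew 4", "Matthew 22", "Acts 9", "Acts 17", "1John 1"]
```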

Since you know the desired order there's no need to sort the array. Here's one way you could do that. (I've called your array of hashes bible.)

bible.group_by { |h| h[:book] }.values_at(*array).flatten
  #=> [{:book=>"Matthew", :chapter=>"4", :section=>"new_testament"},
  #    {:book=>"Matthew", :chapter=>"22", :section=>"new_testament"},
  #    {:book=>"Mark", :chapter=>"6", :section=>"new_testament"},
  #    {:book=>"Acts", :chapter=>"9", :section=>"new_testament"},
  #    {:book=>"Acts", :chapter=>"17", :section=>"new_testament"},
  #    {:book=>"1John", :chapter=>"1", :section=>"new_testament"},
  #    {:book=>"1John", :chapter=>"1", :section=>"new_testament"}] 

Since Enumerable#group_by, Hash#values_at and Array#flatten each require just one pass through their input, this may be faster than sorting when bible is large.
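
One thing to watch for with this approach: Hash#values_at returns nil for any book in the order array that has no entries, and entries whose book is not listed in the order array are dropped silently. A small sketch with reduced sample data (not the full array from the question) shows the nil case and one way to guard against it with Array#compact:

```ruby
array = ['Matthew', 'Mark', 'Acts', '1John']

# Sample data with no 'Mark' or '1John' entries.
bible = [
  { book: 'Acts',    chapter: '9'  },
  { book: 'Matthew', chapter: '4'  },
  { book: 'Acts',    chapter: '17' }
]

grouped = bible.group_by { |h| h[:book] }

# Without a guard, the missing books show up as nil in the result.
with_gaps = grouped.values_at(*array).flatten
# => [{Matthew 4}, nil, {Acts 9}, {Acts 17}, nil]

# Dropping the nils before flattening leaves only real entries.
ordered = grouped.values_at(*array).compact.flatten
# => [{Matthew 4}, {Acts 9}, {Acts 17}]
```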

Here are the steps.

h = bible.group_by { |h| h[:book] }
  #=> {"Matthew"=>[{:book=>"Matthew", :chapter=>"4", :section=>"new_testament"},
  #                {:book=>"Matthew", :chapter=>"22", :section=>"new_testament"}],
  #    "Mark"   =>[{:book=>"Mark", :chapter=>"6", :section=>"new_testament"}],
  #    "1John"  =>[{:book=>"1John", :chapter=>"1", :section=>"new_testament"},
  #                {:book=>"1John", :chapter=>"1", :section=>"new_testament"}],
  #    "Acts"   =>[{:book=>"Acts", :chapter=>"9", :section=>"new_testament"}, 
  #                {:book=>"Acts", :chapter=>"17", :section=>"new_testament"}]
  #   } 

a = h.values_at(*array)
  #=> h.values_at('Matthew', 'Mark', 'Acts', '1John')
  #=> [[{:book=>"Matthew", :chapter=>"4", :section=>"new_testament"},
  #     {:book=>"Matthew", :chapter=>"22", :section=>"new_testament"}],
  #    [{:book=>"Mark", :chapter=>"6", :section=>"new_testament"}],
  #    [{:book=>"Acts", :chapter=>"9", :section=>"new_testament"},
  #     {:book=>"Acts", :chapter=>"17", :section=>"new_testament"}],
  #    [{:book=>"1John", :chapter=>"1", :section=>"new_testament"},
  #     {:book=>"1John", :chapter=>"1", :section=>"new_testament"}]] 

Lastly, a.flatten returns the array shown earlier.

Let's do a benchmark.

require 'fruity'

@bible = [
  {:book=>"Matthew",
   :chapter=>"4",
   :section=>"new_testament"},
  {:book=>"Matthew",
   :chapter=>"22",
   :section=>"new_testament"},
  {:book=>"Mark",
   :chapter=>"6",
   :section=>"new_testament"},
  {:book=>"1John",
   :chapter=>"1",
   :section=>"new_testament"},
  {:book=>"1John",
   :chapter=>"1",
   :section=>"new_testament"},
  {:book=>"Acts",
   :chapter=>"9",
   :section=>"new_testament"},
  {:book=>"Acts",
   :chapter=>"17",
   :section=>"new_testament"}]

@order = ['Matthew', 'Mark', 'Acts', '1John']

def bench_em(n)
  arr = (@bible*((n/@bible.size.to_f).ceil))[0,n].shuffle
  puts "arr contains #{n} elements"
  compare do 
    _sort       { arr.sort { |h1,h2| @order.index(h1[:book]) <=>
                  @order.index(h2[:book]) }.size }
    _sort_by    { arr.sort_by { |h| @order.find_index(h[:book]) }.size }
    _sort_by_with_hash {ord=@order.each.with_index.to_h;
                        arr.sort_by {|b| ord[b[:book]]}.size}    
    _values_at  { arr.group_by { |h| h[:book] }.values_at(*@order).flatten.size }
  end
end

@maxpleaner, @ChaitanyaKale and @Michael Kohl contributed _sort, _sort_by, and _sort_by_with_hash, respectively.

bench_em    100
arr contains 100 elements
Running each test 128 times. Test will take about 1 second.
_sort_by is similar to _sort_by_with_hash
_sort_by_with_hash is similar to _values_at
_values_at is faster than _sort by 2x ± 1.0

bench_em  1_000
arr contains 1000 elements
Running each test 16 times. Test will take about 1 second.
_sort_by_with_hash is similar to _values_at
_values_at is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1

bench_em 10_000
arr contains 10000 elements
Running each test once. Test will take about 1 second.
_values_at is faster than _sort_by_with_hash by 10.000000000000009% ± 10.0%
_sort_by_with_hash is faster than _sort_by by 10.000000000000009% ± 10.0%
_sort_by is faster than _sort by 2x ± 0.1

bench_em 100_000
arr contains 100000 elements
Running each test once. Test will take about 3 seconds.
_values_at is similar to _sort_by_with_hash
_sort_by_with_hash is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1

Here's a second run.

bench_em    100
arr contains 100 elements
Running each test 128 times. Test will take about 1 second.
_sort_by_with_hash is similar to _values_at
_values_at is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1

bench_em  1_000
arr contains 1000 elements
Running each test 8 times. Test will take about 1 second.
_values_at is faster than _sort_by_with_hash by 10.000000000000009% ± 10.0%
_sort_by_with_hash is similar to _sort_by
_sort_by is faster than _sort by 2.2x ± 0.1

bench_em 10_000
arr contains 10000 elements
Running each test once. Test will take about 1 second.
_values_at is similar to _sort_by_with_hash
_sort_by_with_hash is similar to _sort_by
_sort_by is faster than _sort by 2x ± 1.0

bench_em 100_000
arr contains 100000 elements
Running each test once. Test will take about 3 seconds.
_sort_by_with_hash is similar to _values_at
_values_at is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1

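Array#sort_by accepts a block that returns a sort key for each element; the elements are then ordered by comparing those keys. (It is Array#sort whose block takes two elements, a and b, and should return -1, 0, or +1 depending on how they compare.) You can use find_index on the array to produce that key.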

array_of_hashes.sort_by {|a| array.find_index(a[:book]) } should do the trick.
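
Note that find_index returns nil for a book that is not in the order array, and sort_by cannot compare nil with an integer. A hedged sketch of one way to handle that, assuming unknown books should go to the end (the 'Luke' entry is made up for illustration):

```ruby
array = ['Matthew', 'Mark', 'Acts', '1John']

# 'Luke' is not in the order array.
array_of_hashes = [
  { book: 'Luke' },
  { book: 'Acts' },
  { book: 'Matthew' }
]

# A nil key cannot be compared with integer keys, so this raises ArgumentError.
error_message = nil
begin
  array_of_hashes.sort_by { |a| array.find_index(a[:book]) }
rescue ArgumentError => e
  error_message = e.message
end

# Falling back to array.size places unknown books after all known ones.
sorted = array_of_hashes.sort_by { |a| array.find_index(a[:book]) || array.size }
names  = sorted.map { |a| a[:book] }
# => ["Matthew", "Acts", "Luke"]
```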

Your mistake is to think that you are sorting. But, in fact, you are not, you already have the order, you just need to place the elements. I'm not proposing a compact or optimal solution, but a simple solution. First convert your large array into a hash indexed by the :book key (which should have been your first data structure), and then just use map:

array = ['Matthew', 'Mark', 'Acts', '1John']
elements = [{:book=>"Matthew",
  :chapter=>"4",
  :section=>"new_testament"},
 {:book=>"Matthew",
  :chapter=>"22",
  :section=>"new_testament"},
 {:book=>"Mark",
  :chapter=>"6",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"9",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"17",
  :section=>"new_testament"}]
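# Collect every entry for a book in an array. A plain by_name[e[:book]] = e
# would keep only the last entry when a book appears more than once.
by_name = Hash.new { |hash, key| hash[key] = [] }
for e in elements
  by_name[e[:book]] << e
end
final = array.map { |x| by_name[x] }.flatten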

Comments
  • Ah this one is better than my solution as sort_by is expensive
  • Max, thank you so much for the quick reply! I'll try this out. I didn't think index could be used to compare string order. This is very elegant--really appreciate your time.
  • @ChaitanyaKale, sort_by is cheap, not expensive. That's because the construction of a hash that maps elements to the sorting criterion is done only once, prior to the sorting operation. By contrast, with <=> indices must be calculated twice for each pairwise comparison, which is much slower than performing two hash lookups.
  • Ah Thanks @CarySwoveland! I read ruby-doc.org/core-2.2.0/Enumerable.html#method-i-sort_by which mentions it being expensive "The current implementation of sort_by generates an array of tuples containing the original collection element and the mapped value. This makes sort_by fairly expensive when the keysets are simple."
  • @ChaitanyaKale, I'm amazed that sort_by uses an array of two-element arrays. I just assumed it would be a hash for fast lookups. Even so, sort_by is much faster than sort in many situations.
  • I somehow missed your answer earlier. I expected the hash to speed up sort_by quite a lot, but my benchmarks suggest that it doesn't.
  • The array is probably too small for that to make a noticeable difference. But for larger arrays I'd certainly use this approach.
  • Ah, I missed your update with all the BM results above. Seems like even for biggish arrays the difference between my and your solution is negligible. Maybe the implementation of sort_by with a two element array offsets the benefit of using a hash for generating the external order, so I'd say one should use whatever code makes semantically the most sense. Thanks for the benchmarks!
  • Apologies Michael, I am just seeing this now. Thank you so much for taking the time to put this together. This is definitely a more readable solution and I have upvoted it. It is nice to understand how the backend sorting works but I will definitely go this route in the future.
  • @KurtW Don't worry, it seems that you learned a lot from this answer and that's what SO is about.
  • Cary, so sorry I didn't catch this yesterday. I was so glad to have a solution that I didn't follow up. This is really good analysis and my @bible array (or whatever it will be called) is quite large. If I get this thing off the ground, I may very well be comparing performance and use your solution in my final code. Thank you!!
  • Thank you for your time Chaitanya!
  • Hmmm. I didn't see there were more than one entry with the same name, forget this.