Simultaneous mutable access to arbitrary indices of a large vector that are guaranteed to be disjoint

Context

I have a case where multiple threads must update objects stored in a shared vector. However, the vector is very large, and the number of elements to update is relatively small.

Problem

In a minimal example, the set of elements to update can be identified by a (hash-)set containing the indices of elements to update. The code could hence look as follows:

let mut big_vector_of_elements = generate_data_vector();

while has_things_to_do() {
    let indices_to_update = compute_indices();
    indices_to_update.par_iter() // Rayon parallel iteration
       .map(|index| big_vector_of_elements[index].mutate())
       .collect()?;
}

This is obviously disallowed in Rust: big_vector_of_elements cannot be borrowed mutably in multiple threads at the same time. However, wrapping each element in, e.g., a Mutex lock seems unnecessary: this specific case would be safe without explicit synchronization. Since the indices come from a set, they are guaranteed to be distinct. No two iterations in the par_iter touch the same element of the vector.

Restating my question

What would be the best way of writing a program that mutates elements in a vector in parallel, where the synchronization is already taken care of by the selection of indices, but where the compiler does not understand the latter?

A near-optimal solution would be to wrap all elements in big_vector_of_elements in some hypothetical UncontendedMutex lock, which would be a variant of Mutex which is ridiculously fast in the uncontended case, and which may take arbitrarily long when contention occurs (or even panics). Ideally, an UncontendedMutex<T> should also be of the same size and alignment as T, for any T.
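For illustration only, here is a minimal sketch of what such a hypothetical UncontendedMutex could look like, assuming that panicking on contention is acceptable. The names UncontendedMutex, Guard, lock and bump_all are made up for this sketch. It relies on unsafe code, and because it carries an extra AtomicBool flag it does not meet the wish of having the same size as T. It uses std::thread::scope rather than Rayon so that it needs no external crates.

```rust
use std::cell::UnsafeCell;
use std::marker::PhantomData;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical lock that never waits: locking it while it is held panics.
pub struct UncontendedMutex<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// SAFETY: the flag ensures that at most one guard, and hence at most one
// `&mut T`, exists at any time.
unsafe impl<T: Send> Sync for UncontendedMutex<T> {}

pub struct Guard<'a, T> {
    owner: &'a UncontendedMutex<T>,
    // Makes the guard behave like `&mut T` for the Send/Sync auto traits.
    _marker: PhantomData<&'a mut T>,
}

impl<T> UncontendedMutex<T> {
    pub fn new(value: T) -> Self {
        Self { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    pub fn lock(&self) -> Guard<'_, T> {
        // A single atomic swap in the uncontended case; no waiting.
        if self.locked.swap(true, Ordering::Acquire) {
            panic!("UncontendedMutex was contended");
        }
        Guard { owner: self, _marker: PhantomData }
    }

    pub fn into_inner(self) -> T {
        self.value.into_inner()
    }
}

impl<T> Deref for Guard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: this guard holds the lock.
        unsafe { &*self.owner.value.get() }
    }
}

impl<T> DerefMut for Guard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // SAFETY: this guard holds the lock.
        unsafe { &mut *self.owner.value.get() }
    }
}

impl<T> Drop for Guard<'_, T> {
    fn drop(&mut self) {
        self.owner.locked.store(false, Ordering::Release);
    }
}

/// Increments the elements at `indices` from several threads.
/// Panics if two threads ever hit the same element at the same time.
pub fn bump_all(data: &[UncontendedMutex<u64>], indices: &[usize]) {
    std::thread::scope(|s| {
        for chunk in indices.chunks(2) {
            s.spawn(move || {
                for &i in chunk {
                    *data[i].lock() += 1;
                }
            });
        }
    });
}
```

With distinct indices no lock is ever contended, so each access costs one atomic swap and one atomic store.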

Related, but different questions:

Multiple questions can be answered with "use Rayon's parallel iterator", "use chunks_mut", or "use split_at_mut":

  • How do I run parallel threads of computation on a partitioned array?
  • Processing vec in parallel: how to do safely, or without using unstable features?
  • How do I pass disjoint slices from a vector to different threads?
  • Can different threads write to different sections of the same Vec?
  • How to give each CPU core mutable access to a portion of a Vec?

These answers do not seem relevant here, since those solutions imply iterating over the entire big_vector_of_elements, and then for each element figuring out whether anything needs to be changed. Essentially, this means that such a solution would look as follows:

let mut big_vector_of_elements = generate_data_vector();

while has_things_to_do() {
    let indices_to_update = compute_indices();
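    // Visits every element, even though only a few need updating.
    // Assumes mutate() returns a Result<(), E>.
    big_vector_of_elements
        .par_iter_mut()
        .enumerate()
        .try_for_each(|(index, element)| {
            if indices_to_update.contains(&index) {
                element.mutate()
            } else {
                Ok(())
            }
        })?;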
}

This solution takes time proportional to the size of big_vector_of_elements, whereas the first solution loops only over a number of elements proportional to the size of indices_to_update.

When the compiler can't verify that mutable references to slice elements are exclusive, Cell is pretty nice.

You can transform a &mut [T] into a &Cell<[T]> using Cell::from_mut, and then a &Cell<[T]> into a &[Cell<T>] using Cell::as_slice_of_cells. All of this is zero-cost: It's just there to guide the type-system.

A &[Cell<T>] is almost like a &[&mut T], but what you can do with the elements is limited to reading or replacing them: you can't get a reference, mutable or not, to the elements themselves. This guarantees that everything is safe, at no dynamic cost.

fn main() {
    use std::cell::Cell;

    let slice: &mut [i32] = &mut [1, 2, 3];
    let cell_slice: &Cell<[i32]> = Cell::from_mut(slice);
    let slice_cell: &[Cell<i32>] = cell_slice.as_slice_of_cells();
    
    let two = &slice_cell[1];
    let another_two = &slice_cell[1];

    println!("This is 2: {:?}", two);
    println!("This is also 2: {:?}", another_two);
    
    two.set(42);
    println!("This is now 42!: {:?}", another_two);
}
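Applied to the question's setting, a minimal single-threaded sketch might look like the following. The function name add_at and its parameters are made up for illustration. One caveat: Cell<T> is not Sync, so a &[Cell<T>] cannot be handed to other threads (for example through Rayon) as-is. This only removes the exclusivity problem within one thread.

```rust
use std::cell::Cell;

/// Adds `delta` to every element named in `indices` (hypothetical helper).
/// Repeated indices are harmless here: such elements are simply updated
/// more than once.
pub fn add_at(data: &mut [i32], indices: &[usize], delta: i32) {
    let cells: &[Cell<i32>] = Cell::from_mut(data).as_slice_of_cells();

    // Handles to arbitrary elements can coexist without borrow conflicts.
    let handles: Vec<&Cell<i32>> = indices.iter().map(|&i| &cells[i]).collect();

    for handle in handles {
        handle.set(handle.get() + delta);
    }
}
```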

You can sort indices_to_update and extract mutable references by calling split_*_mut.

let len = big_vector_of_elements.len();

while has_things_to_do() {
    let mut tail = big_vector_of_elements.as_mut_slice();

    let mut indices_to_update = compute_indices();
    // I assume compute_indices() returns an unsorted vector here,
    // to highlight the importance of sorting the indices first
    indices_to_update.sort();

    let mut elems = Vec::new();

    for idx in indices_to_update {
        // cut prefix, so big_vector[idx] will be tail[0]
        tail = tail.split_at_mut(idx - (len - tail.len())).1;

        // extract tail[0]
        let (elem, new_tail) = tail.split_first_mut().unwrap();
        elems.push(elem);

        tail = new_tail;
    }
}

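Double check everything in this code; I didn't test it. Then you can call elems.par_iter_mut() (or elems.into_par_iter()) or whatever; a plain elems.par_iter() would only hand out shared references, which are not enough to mutate the elements.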
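A self-contained variant of this idea, as a sketch: it uses std::thread::scope instead of Rayon so that it runs without extra crates, and the names disjoint_muts and increment_in_parallel are invented for this example. It assumes the indices are distinct and in bounds, and panics otherwise.

```rust
use std::thread;

/// Returns one `&mut T` per index, ordered by ascending index.
/// Panics if an index repeats or is out of bounds.
pub fn disjoint_muts<T>(data: &mut [T], mut indices: Vec<usize>) -> Vec<&mut T> {
    indices.sort_unstable();
    let mut refs = Vec::with_capacity(indices.len());
    let mut tail = data;
    let mut consumed = 0; // number of elements already cut off the front
    for idx in indices {
        // Take the remaining slice out of `tail` so it can be split by value.
        let rest = std::mem::take(&mut tail);
        // Drop the prefix, so that the wanted element becomes rest[0].
        let (_, rest) = rest.split_at_mut(idx - consumed);
        let (elem, rest) = rest.split_first_mut().expect("index out of bounds");
        refs.push(elem);
        consumed = idx + 1;
        tail = rest;
    }
    refs
}

/// Increments the selected elements, spreading them over up to `workers` threads.
pub fn increment_in_parallel(data: &mut [u64], indices: Vec<usize>, workers: usize) {
    let mut refs = disjoint_muts(data, indices);
    let workers = workers.max(1);
    let per_worker = ((refs.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        for group in refs.chunks_mut(per_worker) {
            s.spawn(move || {
                for elem in group {
                    **elem += 1;
                }
            });
        }
    });
}
```

The cost is one sort of the index list plus a pass over it, so the work stays proportional to the number of indices rather than to the length of the vector.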

You may be looking for a disjoint-set data structure, a form of partitioning defined by sets of indices to elements of a list. A good Rust implementation of this structure would allow you to safely and efficiently traverse and mutate the values of each set in parallel set-wise, since the sets are known to be disjoint.

Luckily, there is the partitions crate, which provides a disjoint-set implementation. Once a PartitionVec is built, each set can be iterated independently using the all_sets_mut() method¹. The following code uses rayon to process three sets of numbers in parallel, each with 2 elements.

use partitions::{partition_vec, partitions_count_expr, PartitionVec};
use rayon::prelude::*;

let mut partition_vec = partition_vec![
    2 => 0, // value 2 in set 0
    4 => 0, // value 4 in set 0
    6 => 1, // value 6 in set 1
    8 => 1,
    10 => 2,
    12 => 2,
];
println!("Before: {:?}", partition_vec.as_slice());

let sets: Vec<_> = partition_vec.all_sets_mut().collect();
sets.into_par_iter().for_each(|set| {
    for (_index, value) in set {
        *value = (*value + 1) * 10;
    }
});

println!("After: {:?}", partition_vec.as_slice());

The output:

Before: [2, 4, 6, 8, 10, 12]
After: [30, 50, 70, 90, 110, 130]

The rest of the problem lies in building this partitioned vector, but the crate already has facilities for turning a standard Vec into a PartitionVec and back. By default, each value is assigned to a singleton set. The function compute_indices() proposed in the question would manipulate this vector to create the intended sets beforehand.

¹ Probably due to an implementation detail (as of version 0.2.4), the corresponding iterator for immutable access, obtained with all_sets(), cannot be safely moved between threads, making it unsuitable for parallel processing.

Comments
  • @Stargateur That method splits a slice into two contiguous slices. The OP wants to split data by a set of indices.
  • IMHO It is not possible in the safe rust, you should use unsafe rust.
  • If the compiler is not able to verify that an operation is safe, but you can prove that it is, using unsafe code can be a good choice. One option would be to wrap the objects in the vector in an UnsafeCell.
  • @SvenMarnach As far as I see, the UnsafeCell would still not be usable in a parallel iterator, since it's not Sync. I could make a custom type and unsafe impl Sync like it shows in the standard library documentation of struct UnsafeCell, but I'm not sure what responsibilities then fall onto my shoulders. If you feel comfortable with it, feel free to expand your comment to an answer.
  • Great answer, thanks. It looks like some features used here weren't available back when I asked the question (as_slice_of_cells was stabilized in August 2019), but at this point, this is probably the best answer.
  • Note: iterating the indices_to_update in reverse order would avoid the index juggling. Apart from that, this seems to be the best way to do this in safe Rust.
  • Thanks, that looks like the basis for a great solution. If I were to swap the HashSet for something that's easy to recursively split in two (e.g., binary tree, or perhaps hibitset), I could recursively split the vector in two using split_at_mut. This fits nicely onto Rayon's fork/join model. It'll probably be somewhat slower than a 'dirty' unsafe solution, but it does show that the safe abstraction offered by Vec gets us very far. I'm considering your submission as the answer, but I'll wait a bit longer to see if anyone else has neat insights like yours, Sven's or E_net4's.
  • Hi, thanks for your contribution. I have two questions about your answer. First: my question was about parallel iteration over the results. Your solution does not provide support for that as-is, and I don't see how to easily add that due to your use of btree_set::range. Can you clarify how you would suggest to make your solution (rayon-)parallelizable? Second — and perhaps also because of the need for parallelism — I wonder why this solution would be correct without the use of e.g. a std::cell::Cell to inform the compiler about unexpected mutability.
  • @Thierry yes, connecting the concepts would be useful, wouldn't it? I've updated with an example. Does that answer both pieces?
  • Thanks again. Storing the indices in a vector seems suboptimal, and I'm not sure what par_bridge does internally, but it looks like it too is not as good as an optimal solution could be; if it is safe to index as you propose. Inspired by your solution, I think manually checking the lower and upper bound of the indices, and then using the par_iter directly on the BTreeSet would be better. But that's because I envision the BTreeSet to play nice with rayon's fork/join semantics, but I honestly get lost in the code implementing the parallel iterator. Regardless, thanks, +1
  • Thanks, that's an interesting approach. I don't really get your proposed solution, though. Building a new PartitionVec in each iteration of while has_things_to_do() { (in my example) defeats the point of not iterating over all elements. Sadly, the partitions crate doesn't seem to offer a way of resetting only the 'meta' data while keeping the actual data. Judging from the implementation, that would require a number of steps in O(n) anyway, and would thus suffer from the same shortcoming. At this point, I don't think this answers my question. Do comment or edit if I misunderstood.
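
The recursive-splitting idea raised in the comments can be sketched as follows. This is a rough illustration rather than a tuned implementation. It uses std::thread::scope where a Rayon-based version would presumably use rayon::join. The function name update_split, the offset parameter and the sequential cutoff of 4 are arbitrary choices for this example, and it assumes the indices arrive as a strictly increasing slice.

```rust
use std::thread;

/// Applies `f` to `data[i - offset]` for every `i` in `sorted`.
/// `sorted` must be strictly increasing and `data` must cover all of its
/// indices; `offset` is the position of `data[0]` in the original vector.
pub fn update_split<T, F>(data: &mut [T], offset: usize, sorted: &[usize], f: &F)
where
    T: Send,
    F: Fn(&mut T) + Sync,
{
    // Small index sets are handled sequentially to limit thread spawning.
    if sorted.len() <= 4 {
        for &i in sorted {
            f(&mut data[i - offset]);
        }
        return;
    }

    // Split the indices in half, then split the data at the first index
    // of the upper half so that each half owns exactly its own elements.
    let (low_idx, high_idx) = sorted.split_at(sorted.len() / 2);
    let pivot = high_idx[0];
    let (low, high) = data.split_at_mut(pivot - offset);

    thread::scope(|s| {
        s.spawn(|| update_split(low, offset, low_idx, f));
        update_split(high, pivot, high_idx, f);
    });
}
```

Only safe code is involved: the two halves are separate &mut slices, so the borrow checker accepts processing them on different threads.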