How do I load a random document from CouchDB (efficiently and fairly)?


I would like to load a random document out of a set of documents stored in a CouchDB database. The method for picking and loading the document should conform to the following requirements:

  • Efficiency: The lookup of the document should be efficient; most importantly, the time to load the document must not grow linearly with the total number of documents. This means the skip query argument cannot be used.

  • Uniform distribution: The choice should be truly random (as far as possible, using standard random number generators); every document should have an equal chance of being chosen.

What is the best way to implement this in CouchDB?




If insert performance is not an issue, you could make the number non-random, e.g. set it to doc_count + 1 at the time of creation. You could then look a document up with a random number 0 <= r < doc_count. That would require either synchronizing the creation of documents or keeping the sequence outside CouchDB, e.g. in an SQL database.

Best regards

Felix
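The sequential-number idea above can be sketched roughly as follows. This is only an in-memory illustration, not CouchDB code: `makeStore`, `insert`, `getBySeq` and `randomDoc` are hypothetical names, and the `Map` stands in for a view keyed by the sequence number. Numbers start at 0 here so that they match the lookup range 0 <= r < doc_count.

```javascript
// Hypothetical in-memory stand-in for a view keyed by a sequential number.
function makeStore() {
  const bySeq = new Map();
  return {
    // Assign the next sequence number at creation time.
    // In a real setup this step is the one that must be synchronized.
    insert(doc) {
      const seq = bySeq.size; // 0-based: 0 <= seq < doc_count
      bySeq.set(seq, { ...doc, seq });
      return seq;
    },
    count() {
      return bySeq.size;
    },
    getBySeq(seq) {
      return bySeq.get(seq);
    },
  };
}

// Pick r uniformly in [0, doc_count) and fetch exactly one document.
// rng is injectable so the choice can be made deterministic in tests.
function randomDoc(store, rng = Math.random) {
  const r = Math.floor(rng() * store.count());
  return store.getBySeq(r);
}

const store = makeStore();
["a", "b", "c", "d"].forEach((name) => store.insert({ name }));
console.log(randomDoc(store).name);
```

The lookup is a single keyed fetch, so its cost does not depend on the number of documents. The sketch assumes documents are never deleted; a deletion would leave a gap in the sequence that a lookup could land on.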



How about "abusing" a view's reduce function?

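function (keys, values, rereduce) {
    // CouchDB passes rereduce = true when `values` holds earlier reduce results.
    if (rereduce)
      // Pick one of the earlier results at random.
      return values[Math.floor(Math.random()*values.length)];
    else
      // First pass: hand the mapped values on unchanged.
      return values;
}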
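One thing to watch with the reduce idea above is that a plain random pick in the rereduce pass is not uniform when the partial results cover different numbers of values. A possible variation, sketched below, carries a count alongside each pick and weights the choice by it. The name `reduceRandom` and the extra `rng` parameter are assumptions added so the sketch can run and be tested under Node; CouchDB itself calls a reduce function with three arguments, and its view server may need older JavaScript syntax.

```javascript
// Sketch: keep one pick plus the number of values that pick stands for.
function reduceRandom(keys, values, rereduce, rng = Math.random) {
  let pick = null;
  let count = 0;
  for (const v of values) {
    // In the rereduce pass each v is an earlier { pick, count } result;
    // in the first pass each v is a single mapped value.
    const item = rereduce ? v : { pick: v, count: 1 };
    count += item.count;
    // Replace the current pick with probability item.count / count,
    // which leaves every underlying value equally likely overall.
    if (rng() * count < item.count) pick = item.pick;
  }
  return { pick, count };
}

// Two partial reductions of different sizes, then a rereduce over both.
const left = reduceRandom(null, ["a", "b", "c"], false);
const right = reduceRandom(null, ["d"], false);
console.log(reduceRandom(null, [left, right], true));
```

As far as I know, CouchDB stores reduce results in the view index and expects reduce functions to be deterministic, so a pick made this way would probably stay the same between index updates instead of changing on every request.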



I agree with @meliodas:

Here's the distribution of option 2 (n=1000):

{ 0.2: 233,
  0.9: 767 }

And with swapping startkey/endkey half the time:

{ 0.2: 572,
  0.9: 428 }

Not sure what happens to the distribution when you look at more data, but it initially seems a bit more promising. This is without using option 1 at all, which I don't think is necessary.
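For anyone who wants to reproduce counts like these, here is a small simulation sketch. It rests on several assumptions that the thread does not spell out: that the two documents carry the stored random keys 0.2 and 0.9, that a lookup returns the first key at or after a random r (or the last key at or before r when startkey/endkey are swapped), and that a lookup which finds nothing is simply retried. The function names are made up for the example.

```javascript
// Assumed stored random keys of the two documents.
const keys = [0.2, 0.9];

// Ascending lookup: first key >= r.
function firstAtOrAfter(sorted, r) {
  return sorted.find((k) => k >= r);
}

// Descending lookup: last key <= r.
function lastAtOrBefore(sorted, r) {
  return [...sorted].reverse().find((k) => k <= r);
}

// Run n successful lookups and count how often each key is returned.
// With swapHalf = true, the direction is flipped on about half the tries.
function simulate(n, swapHalf, rng = Math.random) {
  const counts = {};
  let done = 0;
  while (done < n) {
    const r = rng();
    const descending = swapHalf && rng() < 0.5;
    const hit = descending ? lastAtOrBefore(keys, r) : firstAtOrAfter(keys, r);
    if (hit === undefined) continue; // assumed behaviour: retry on a miss
    counts[hit] = (counts[hit] || 0) + 1;
    done++;
  }
  return counts;
}

console.log(simulate(1000, false));
console.log(simulate(1000, true));
```

Under these assumptions the ascending-only case gives 0.2 a share of about 2/9 and 0.9 about 7/9, because each key is chosen in proportion to the gap below it. That is in line with the 233/767 split above, and it shows why the bias depends on how the stored keys happen to be spaced.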
