Getting random row through SQLAlchemy
How do I select a (or some) random row(s) from a table using SQLAlchemy?
This is very much a database-specific issue.
I know that PostgreSQL, SQLite, MySQL, and Oracle have the ability to order by a random function, so you can use this in SQLAlchemy:
    from sqlalchemy.sql.expression import func, select

    select.order_by(func.random())  # for PostgreSQL, SQLite
    select.order_by(func.rand())  # for MySQL
    select.order_by('dbms_random.value')  # for Oracle
Next, you need to limit the query by the number of records you need (for example using .limit()).
Bear in mind that, at least in PostgreSQL, selecting a random record this way has severe performance issues; here is a good article about it.
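The two steps above (order by the database's random function, then limit) can be put together roughly as follows. This is only a sketch, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database; the `items` table and the `random_rows` helper are made up for illustration and are not part of the original answer.

```python
import sqlalchemy as sa

# Hypothetical table, used only for this illustration.
engine = sa.create_engine("sqlite://")
metadata = sa.MetaData()
items = sa.Table(
    "items",
    metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("name", sa.String),
)
metadata.create_all(engine)

# Insert a few rows to sample from.
with engine.begin() as conn:
    conn.execute(items.insert(), [{"name": f"item-{i}"} for i in range(20)])


def random_rows(conn, n):
    """Return n rows in random order (hypothetical helper)."""
    # func.random() suits SQLite and PostgreSQL; MySQL would use func.rand().
    stmt = sa.select(items).order_by(sa.func.random()).limit(n)
    return conn.execute(stmt).fetchall()


with engine.connect() as conn:
    sample = random_rows(conn, 3)
```

Because the database has to assign a random value to every row and sort them all before applying the limit, this is best kept for small or medium tables.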
If you are using the ORM, the table is not big (or you have its row count cached), and you want the solution to be database independent, the really simple approach is:
    import random

    rand = random.randrange(0, session.query(Table).count())
    row = session.query(Table)[rand]
This is cheating slightly, but that's why you use an ORM.
There is a simple way to pull a random row that IS database independent. Just use .offset(). No need to pull all rows:

    import random

    query = DBSession.query(Table)
    rowCount = int(query.count())
    randomRow = query.offset(int(rowCount*random.random())).first()
Where Table is your table (or you could put any query there). If you want a few rows, then you can just run this multiple times, and make sure that each row is not identical to the previous.
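For the "few rows" case, one alternative to repeating the query and checking for duplicates is to draw distinct offsets up front with `random.sample`. The sketch below swaps in that technique; it assumes SQLAlchemy 1.4 or newer, and the `Item` model and `random_distinct_rows` helper are hypothetical names used only for illustration.

```python
import random

import sqlalchemy as sa
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Item(Base):
    """Hypothetical model, used only for this illustration."""
    __tablename__ = "items"
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)


engine = sa.create_engine("sqlite://")
Base.metadata.create_all(engine)


def random_distinct_rows(session, model, k):
    """Fetch up to k distinct rows, one OFFSET query per row."""
    # A fixed ordering keeps each offset pointing at a well-defined row.
    query = session.query(model).order_by(model.id)
    row_count = query.count()
    # Distinct offsets guarantee distinct rows, so no retry loop is needed.
    offsets = random.sample(range(row_count), min(k, row_count))
    return [query.offset(offset).limit(1).one() for offset in offsets]


with Session(engine) as session:
    session.add_all([Item(name=f"item-{i}") for i in range(10)])
    session.commit()
    picked = random_distinct_rows(session, Item, 4)
    picked_ids = [row.id for row in picked]
```

This still issues one query per row plus a count, so it only pays off when the number of rows wanted is small.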
Here are four different variations, ordered from slowest to fastest. The timeit results are at the bottom:
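    import random

    from sqlalchemy.sql import func
    from sqlalchemy.orm import load_only

    # model_name (a mapped model exposing .query) and db are defined elsewhere

    def simple_random():
        # load every row, then choose in Python
        return random.choice(model_name.query.all())

    def load_only_random():
        # same, but load only the id column
        return random.choice(model_name.query.options(load_only('id')).all())

    def order_by_random():
        # let the database shuffle and return the first row
        return model_name.query.order_by(func.random()).first()

    def optimized_random():
        # skip to a random offset computed from the row count
        return model_name.query.options(load_only('id')).offset(
            func.floor(
                func.random() *
                db.session.query(func.count(model_name.id))
            )
        ).limit(1).all()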
timeit results for 10,000 runs on my MacBook against a PostgreSQL table with 300 rows:

    simple_random():     90.09954111799925
    load_only_random():  65.94714171699889
    order_by_random():   23.17819356000109
    optimized_random():  19.87806927999918
You can easily see that using func.random() is far faster than returning all results to Python's random.choice(). Additionally, as the size of the table increases, the performance of order_by_random() will degrade significantly, because an ORDER BY requires a full table scan, whereas the COUNT in optimized_random() can use an index.
Some SQL DBMSs, namely Microsoft SQL Server, DB2, and PostgreSQL, have implemented the SQL:2003 TABLESAMPLE clause. Support was added to SQLAlchemy in version 1.1. It allows returning a sample of a table using different sampling methods; the standard requires SYSTEM and BERNOULLI, which return a desired approximate percentage of a table.
    # Approx. 1%, using SYSTEM method
    sample1 = mytable.tablesample(1)

    # Approx. 1%, using BERNOULLI method
    sample2 = mytable.tablesample(func.bernoulli(1))
There's a slight gotcha when used with mapped classes: the produced TableSample object must be aliased in order to be used to query model objects:

    from sqlalchemy import tablesample
    from sqlalchemy.orm import aliased

    sample = aliased(MyModel, tablesample(MyModel, 1))
    res = session.query(sample).all()
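To see what a TABLESAMPLE query looks like without connecting to a server, the statement can be compiled against the PostgreSQL dialect. This is a sketch assuming SQLAlchemy 1.4 or newer; the `foo` table definition here is a stand-in invented for the example.

```python
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

# Stand-in table definition; no database connection is needed to compile.
metadata = sa.MetaData()
foo = sa.Table("foo", metadata, sa.Column("id", sa.Integer, primary_key=True))

# Approx. 1% of the rows, using the BERNOULLI method.
sample = foo.tablesample(sa.func.bernoulli(1))
stmt = sa.select(sample.c.id)

# Render the statement as PostgreSQL SQL text.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

The printed statement selects from an anonymous alias of `foo` followed by a TABLESAMPLE clause, which is why the mapped-class case needs `aliased()`.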
Since many of the answers contain performance benchmarks, I'll include some simple tests here as well. Using a simple table in PostgreSQL with about a million rows and a single integer column, select (approx.) 1% sample:
    In : %%timeit
    ...: foo.select().\
    ...:     order_by(func.random()).\
    ...:     limit(select([func.round(func.count() * 0.01)]).
    ...:           select_from(foo).
    ...:           as_scalar()).\
    ...:     execute().\
    ...:     fetchall()
    ...:
    307 ms ± 5.72 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

    In : %timeit foo.tablesample(1).select().execute().fetchall()
    6.36 ms ± 188 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

    In : %timeit foo.tablesample(func.bernoulli(1)).select().execute().fetchall()
    19.8 ms ± 381 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Before rushing to use the SYSTEM sampling method, one should know that it samples pages, not individual tuples, so it might not be suitable for small tables, for example, and may not produce as random results if the table is clustered.
- +1. Same as Postgres works for SQLite:
- You can use order_by('dbms_random.value') in Oracle.
- If you are using declarative models:
- Thanks @trinth, it worked when I added parentheses to the end:
- Since SQLAlchemy v0.4, func.random() is a generic function that compiles to the database's random implementation.
- rand = random.randrange(0, session.query(Table).count())
- You choose and create all objects before choose one of
- How about
- Update: at around 10 million rows in MySQL this actually started to get a little slow. I guess you could optimize it.
- Works well for me in a ~500k rows setting.