How to Select 10 Random Rows from 600K Rows Fast in MySQL?

How to Select 10 Random Rows from 600K Rows Fast in MySQL?

To select 10 random rows from 600K rows in MySQL, we have several methods that can be used to get output. Sometimes when we have large datasets and need to filter or get the random number rows, we can use methods like OrderBy Rand(), and Random OFFSET(). In this blog, you will learn all the approaches to selecting 10 random rows from 600k rows in MySQL in detail with examples for each.

Table of Contents:

Let’s create a table called your_table and insert some values into it, which can be used as an example for the following methods. 

-- Create table
CREATE TABLE your_table (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50),
    age INT
);
INSERT INTO your_table (name, age)
SELECT 
    CONCAT('User', id) AS name,
    FLOOR(18 + (RAND() * 42)) AS age
FROM 
    (SELECT @row := @row + 1 AS id FROM 
    (SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) t1,
    (SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) t2,
    (SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) t3,
    (SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) t4,
    (SELECT @row := 0) init) temp
LIMIT 600000;
SELECT * FROM your_table

Output:

The table will have 600k rows…

Methods to Select Random 10 Rows from 600k Rows in MySQL

There are methods like ORDER BY RAND(), Random OFFSET(), and user variable. To get the random 10 rows fast.

Method 1: Using ORDER BY Rand() in MySQL

The ORDER BY RAND() is a method in MySQL that can sort all the rows efficiently. This may be inefficient for the larger datasets because it needs to scan the entire table.

Example:

SELECT * FROM your_table
 ORDER BY RAND() 
LIMIT 10;

Output:

Explanation: The RAND() function selected the random 10 rows from 600k rows faster.

Method 2: Using indexed Primary Key in MySQL

This Indexed Primary in MySQL is best when the data in the table is properly indexed and if the table has an auto-increment primary key (id). However, if the data is not orderly, this may leave fewer gaps while fetching the rows.

Example:

SELECT * FROM your_table
WHERE id >= (SELECT FLOOR(RAND() * (SELECT MAX(id) FROM your_table)))
ORDER BY id
LIMIT 10;

Output:

Explanation: The SELECT FLOOR(RAND() * (SELECT MAX(id)) fetched the random 10 rows from the table.

Method 3: Using JOIN with Random OFFSET() in MySQL

The JOIN with Random OFFSET() fetches the random rows without sorting the entire table.

This is much more efficient than ORDER BY RAND()

Example:

-- Declare a variable for the random offset
SET @rand_offset = FLOOR(RAND() * (SELECT COUNT(*) FROM your_table));

-- Prepare and execute the query with the computed offset
SET @query = CONCAT('SELECT * FROM your_table LIMIT 10 OFFSET ', @rand_offset);
PREPARE stmt FROM @query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Output:

Explanation: The offset uses the prepared statements SET @rand_offset = FLOOR(RAND() * (SELECT COUNT(*) FROM your_table)); by this query, get the random rows from a table. As offset will not directly work on MySQL to get the 10 random rows.

Method 4: Using User Variable Method in MySQL

The User Variable method in MySQL assigns a random number to each row without the need to sort the entire table.

Example:

SELECT * FROM (
    SELECT *, (@row_num := @row_num + 1) AS row_num 
    FROM your_table, (SELECT @row_num := 0) AS init
) AS t
WHERE row_num >= FLOOR(RAND() * (SELECT COUNT(*) FROM your_table))
LIMIT 10;

Output:

Explanation: The User variable works well when the id has gaps. The SELECT *, (@row_num := @row_num + 1) AS row_num query will allocate random row numbers for rows.

Performance Comparison

Method Performance ConsiderationEfficiencyBest Use Case
ORDER BY RAND()The ORDER BY RAND() will scan the entire table and sort the rows before randomly selecting 10 rows. Efficiency is slow as it needs to scan the entire table.It will be best when we use it on small tables.  
 Indexed Primary KeyIt uses an indexed ID, so it avoids full scanning of the table.   Fastest, as it doesn’t need to scan the entire table.It is best for large tables. 
JOIN with Random OFFSETThe JOIN with random offset to get random rows from the table.  Moderate as COUNT and offset will make the efficiency slower.If the tables have gaps in ID, then offset is the best practice.
User Variable MethodThe user variable method uses row numbers with defined variables, this will avoid scanning the entire table.It is efficient as it easily assigns a random number to the rows.It performs its best when there are large datasets.

Real-World Examples

Case 1: To select 10 random movie names from the list.

Example:

-- Create a movie table
CREATE TABLE movies (
    id INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255),
    genre VARCHAR(50),
    release_year INT
);

-- Insert sample movies (15 rows)
INSERT INTO movies (title, genre, release_year) VALUES
('Inception', 'Sci-Fi', 2010),
('Titanic', 'Romance', 1997),
('The Matrix', 'Action', 1999),
('Interstellar', 'Sci-Fi', 2014),
('Joker', 'Drama', 2019),
('Gladiator', 'Action', 2000),
('The Dark Knight', 'Action', 2008),
('Forrest Gump', 'Drama', 1994),
('Parasite', 'Thriller', 2019),
('The Godfather', 'Crime', 1972),
('Avengers: Endgame', 'Superhero', 2019),
('Pulp Fiction', 'Crime', 1994),
('Schindler List', 'Historical', 1993),  
('The Lion King', 'Animation', 1994),
('Fight Club', 'Drama', 1999);

-- Query to select 10 random movies
SELECT * FROM movies
WHERE id >= (SELECT FLOOR(RAND() * (SELECT MAX(id) FROM movies)))
ORDER BY id 
LIMIT 10;

Output:

Explanation: The Indexed primary key function randomly selected 10 movies from the list.

Case 2: To randomly fetch user names in the social media platform.

Example:

-- Create a user table
CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(50),
    country VARCHAR(50)
);

-- Insert sample users (15 rows)
INSERT INTO users (username, country) VALUES
('user_01', 'USA'),
('user_02', 'Canada'),
('user_03', 'India'),
('user_04', 'Germany'),
('user_05', 'France'),
('user_06', 'Japan'),
('user_07', 'Australia'),
('user_08', 'Brazil'),
('user_09', 'UK'),
('user_10', 'Mexico'),
('user_11', 'Italy'),
('user_12', 'Spain'),
('user_13', 'South Korea'),
('user_14', 'Netherlands'),
('user_15', 'Sweden');

-- Query to select 10 random users using user variables
SELECT * FROM (
    SELECT *, (@row_num := @row_num + 1) AS row_num 
    FROM users, (SELECT @row_num := 0) AS init
) AS t
WHERE row_num >= FLOOR(RAND() * (SELECT COUNT(*) FROM users))
LIMIT 10;

Output:

Explanation: To select a random user name, we used the user-variable method, as it can handle large datasets and can handle gaps in the IDs well.

Case 3: To gather 10 random quizzes from the online learning platform Intellipaat.

Example:

-- Create quiz questions table
CREATE TABLE quiz_questions (
    id INT AUTO_INCREMENT PRIMARY KEY,
    question TEXT,
    difficulty VARCHAR(10)
);

-- Insert sample quiz questions (15 rows)
INSERT INTO quiz_questions (question, difficulty) VALUES
('What is SQL and what does it stand for?', 'Easy'),
('What is the difference between SQL and MySQL?', 'Easy'),
('What are the different types of SQL commands?', 'Easy'),
('What is the purpose of the GROUP BY clause in SQL?', 'Medium'),
('What is the difference between WHERE and HAVING clauses?', 'Medium'),
('What is the use of the ORDER BY clause in SQL?', 'Easy'),
('What is a primary key and why is it important?', 'Easy'),
('What are joins in SQL? Name different types of joins.', 'Medium'),
('What is normalization in SQL? Explain different normal forms.', 'Hard'),
('What is an index in SQL, and how does it improve performance?', 'Medium'),
('What is the difference between DELETE, TRUNCATE, and DROP?', 'Medium'),
('What is a stored procedure in SQL?', 'Medium'),
('What are ACID properties in a database?', 'Hard'),
('What is a foreign key in SQL?', 'Easy'),
('Explain the concept of a subquery with an example.', 'Hard');
SELECT * FROM (
    SELECT * FROM quiz_questions ORDER BY RAND() LIMIT 1000
) AS subset ORDER BY RAND()
LIMIT 10;

Output:

Explanation: The randomly selected quiz question from the Intellipaat learning platform has been achieved by subset+order by rand() function.

Best Practices

  1. Avoid using ORDER BY RAND() on large datasets, as it requires sorting the entire table, making it inefficient
  2. Use indexed Columns for faster optimization. As it uses an auto-incremented primary key, it will help to find random variables easily.
  3. Make sure you use proper selection methods based on your dataset.
  4. For very large tables, consider using the indexed columns and maintaining a secondary table with random row selections to improve efficiency
  5. Try to save all the frequently generated random sets in a separate space. This will reduce the cache space and reduce the repetition of the database.

Conclusion

In conclusion, randomly selecting 10 rows from 600k rows quickly in MySQL can be done by using methods like ORDER BY RAND(), JOIN with OFFSET(), indexed primary key, etc. Choose the best method based on its efficiency, performance speed, and dataset size. By this, you can retrieve any random rows from any table. In this blog, you have gained knowledge on how to select random rows from multiple records.

To learn more about SQL functions, check out this SQL course and also explore SQL Interview Questions prepared by industry experts.

How to Select 10 Random Rows from 600K Rows Fast in MySQL – FAQs

1. How do I select only 10 rows in MySQL?

Use LIMIT 10 in MySQL to select only 10 rows.

2. How to select a random 1000 rows in SQL Server?

Use TABLESAMPLE or NEWID() with TOP 1000 in SQL Server to select random 1000 rows.

3. How do I select 10 random rows from a table?

Use ORDER BY RAND() with LIMIT 10 to select 10 random rows from a table.

4. How do I select more than 1000 rows in MySQL?

Use LIMIT 1000, OFFSET x, or no LIMIT to select more than 1000 rows in MySQL.

5. How do I select 10 random rows in MySQL?

Use efficient methods like indexed primary key, ORDER BY RAND(), or JOIN with random OFFSET() to select 10 random rows in MySQL.

About the Author

Data Engineer

As a skilled Data Engineer, Sahil excels in SQL, NoSQL databases, Business Intelligence, and database management. He has contributed immensely to projects at companies like Bajaj and Tata. With a strong expertise in data engineering, he has architected numerous solutions for data pipelines, analytics, and software integration, driving insights and innovation.

business intelligence professional