Aaron Saray

open source programmer,
web developer

entrepreneur, author
and musician

My Blog

contains PHP, Web and business/entrepreneurial related content. Please join in the conversation!

Random user generation – optimized

I came across this blog posting about optimizing order by rand() and decided to make my queries better. Here is my real life example on how to optimize this query:

Normally, when you click the random link on jemdiary (jemdiary.com/read/random), you are retrieving a user’s username to view their diary. This user has to have their account flagged to allow random reads, and not be globally password protected. Finally, they need to be an active user. The settings are in the Tsettings table, the users are in the Tuser table. See my current sql statement:

1
2
3
4
5
6
7
select u.userID, u.userName
        from Tuser u
        inner join TuserSettings s
            on u.userID=s.userID
        where s.allowRandom=1 and s.globalPasswordProtect=0 and u.status='A'
        order by rand()
        limit 1

This, unfortunately is going to be slow because of the order by portion of the query. Using what I learned in that blog post, I was able to modify the sql to take advantage of their speedy query. Its important to know that I have only deleted a few rows from the user-database. About 3% are missing. The rest have just been inactivated (eh… design changes as time goes on ya know ;) ). So, I didn’t implement the balancing algorithm. Additionally, I added a few extra items in for testing, so I could make sure that my users were actually validating correctly. See this sql:

1
2
3
4
5
6
7
8
9
10
11
12
select u.userID, u.userName, s.globalPasswordProtect, s.allowRandom
        from Tuser u JOIN
            (select (rand() *
                (select max(userID)
                    from Tuser)) as id)
            AS r2
            inner join TuserSettings s
            on u.userID=s.userID
        where u.userID >= r2.id and
        s.allowRandom=1 and s.globalPasswordProtect=0 and u.status='A'
        order by u.userID ASC
        limit 1

This was a nice fast way to do it. (The production version of this only selects u.userID and u.userName).

This entry was posted in SQL and tagged . Bookmark the permalink.

2 Responses to Random user generation – optimized

  1. Tord says:

    Nice solution but it will only work for selecting one random result from the table, or am i wrong.

    Shore it will give a different sub table each time but it’s only the first ‘id’ that will be random since you do “where u.userID >= r2.id” this will give a sub set of rows starting from userID greather then the random number.

    In your case you Limit the result to 1 “one” row, but what if you need more like five or ten randomly selected rows?

    Just wanted to give my feedback and if you or anyone else has a tip or solution in store for selecting a greater random generated sub set I’d love to here about it.

    • Aaron says:

      @Tord: You are correct – it will only choose one random row. I’ve never really chosen more than one random row at a time. I think what I would do in that case is to write a stored procedure and execute that. True, it has to be crafted (its easy to just say “write one”) – but thats the direction I would start to look at.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>