0 votes
1 view
in SQL by (21k points)

How would you design a database to support the following tagging features:

  • items can have a large number of tags
  • searches for all items that are tagged with a given set of tags must be quick (the items must have ALL tags, so it's an AND-search, not an OR-search)
  • creating/writing items may be slower to enable quick lookup/reading

Ideally, the lookup of all items that are tagged with (at least) a set of n given tags should be done using a single SQL statement. Since the number of tags to search for as well as the number of tags on any item is unknown and may be high, using JOINs is impractical.

Any ideas?

Thanks for all the answers so far.

If I'm not mistaken, however, the given answers show how to do an OR-search on tags. (Select all items that have one or more of n tags). I am looking for an efficient AND-search. (Select all items that have ALL n tags - and possibly more.)

1 Answer

0 votes
by (37.4k points)

AND Logical Operator: It seems like you are looking for the "relational division" operation. 

You can refer to this article on tagging Database schemas:

http://howto.philippkeller.com/2005/04/24/Tags-Database-schemas/

Performance: The bitmap-based approach intuitively seems like it will suit the situation well. 

For performance tests you can refer to below link:

http://howto.philippkeller.com/2005/06/19/Tagsystems-performance-tests/

Some DBMS (including Oracle) offers bitmap indexes which may somehow be of use because the built-in indexing system does away with the potential complexity of index maintenance; 

Additionally, the DBMS which offers bitmap indexes will be able to consider them in a proper while performing the query plan.

...