0 votes
1 view
in SQL by (20.3k points)

In trying to improve the speed of an immensely slow query (several minutes on two tables with only ~50,000 rows each, on SQL Server 2008 if it matters), I narrowed down the problem to an OR in my inner join, as in:

SELECT mt.ID, mt.ParentID, ot.MasterID

  FROM dbo.MainTable AS mt

  INNER JOIN dbo.OtherTable AS ot ON ot.ParentID = mt.ID

                                  OR ot.ID = mt.ParentID

I changed this to (what I hope is) an equivalent pair of left joins, shown here:

SELECT mt.ID, mt.ParentID,

   CASE WHEN ot1.MasterID IS NOT NULL THEN

      ot1.MasterID ELSE

      ot2.MasterID END AS MasterID

  FROM dbo.MainTable AS mt

  LEFT JOIN dbo.OtherTable AS ot1 ON ot1.ParentID = mt.ID

  LEFT JOIN dbo.OtherTable AS ot2 ON ot2.ID = mt.ParentID

  WHERE ot1.MasterID IS NOT NULL OR ot2.MasterID IS NOT NULL

.. and the query now runs in about a second!

Is it generally a bad idea to put an OR in a join condition? Or am I just unlucky somehow in the layout of my tables?

1 Answer

0 votes
by (36.7k points)

This kind of JOIN is not optimizable to a HASH JOIN or a MERGE JOIN.

It can be expressed as a concatenation of two resultsets:

SELECT  *

FROM    maintable m

JOIN    othertable o

ON      o.parentId = m.id

UNION

SELECT  *

FROM    maintable m

JOIN    othertable o

ON      o.id = m.parentId

, each of them being an equijoin, however, SQL Server's optimizer is not smart enough to see it in the query you wrote (though they are logically equivalent).

...