0 votes
1 view
in SQL by (20.3k points)

In trying to improve the speed of an immensely slow query (several minutes on two tables with only ~50,000 rows each, on SQL Server 2008 if it matters), I narrowed down the problem to an OR in my inner join, as in:

SELECT mt.ID, mt.ParentID, ot.MasterID

  FROM dbo.MainTable AS mt

  INNER JOIN dbo.OtherTable AS ot ON ot.ParentID = mt.ID

                                  OR ot.ID = mt.ParentID

I changed this to (what I hope is) an equivalent pair of left joins, shown here:

SELECT mt.ID, mt.ParentID,

   CASE WHEN ot1.MasterID IS NOT NULL THEN

      ot1.MasterID ELSE

      ot2.MasterID END AS MasterID

  FROM dbo.MainTable AS mt

  LEFT JOIN dbo.OtherTable AS ot1 ON ot1.ParentID = mt.ID

  LEFT JOIN dbo.OtherTable AS ot2 ON ot2.ID = mt.ParentID

  WHERE ot1.MasterID IS NOT NULL OR ot2.MasterID IS NOT NULL

.. and the query now runs in about a second!

Is it generally a bad idea to put an OR in a join condition? Or am I just unlucky somehow in the layout of my tables?

1 Answer

0 votes
by (40.3k points)

This kind of JOIN is not optimizable to a HASH JOIN or a MERGE JOIN.

It can be expressed as a concatenation of two resultsets:

SELECT  *

FROM    maintable m

JOIN    othertable o

ON      o.parentId = m.id

UNION

SELECT  *

FROM    maintable m

JOIN    othertable o

ON      o.id = m.parentId

, each of them being an equijoin, however, SQL Server's optimizer is not smart enough to see it in the query you wrote (though they are logically equivalent).

Related questions

Welcome to Intellipaat Community. Get your technical queries answered by top developers !


Categories

...