Select rows which are not present in other table

Question

asked Jul 17, 2019 in SQL by Tech4ever (20.3k points)

I've got two postgresql tables:

table name column names
----------- ------------------------
login_log ip | etc.
ip_location ip | location | hostname | etc.

I want to get every IP address from login_log which doesn't have a row in ip_location.

I tried this query but it throws a syntax error.

SELECT login_log.ip
FROM login_log
WHERE NOT EXIST (SELECT ip_location.ip
FROM ip_location
WHERE login_log.ip = ip_location.ip)

ERROR: syntax error at or near "SELECT"

LINE 3: WHERE NOT EXIST (SELECT ip_location.ip`

I'm also wondering if this query (with adjustments to make it work) is the best performing query for this purpose.

2 Answers

Related questions

0 votes

1 answer

SQL SELECT speed int vs varchar

asked Jul 26, 2019 in SQL by Tech4ever (20.3k points)

0 votes

1 answer

PostgreSQL error: Fatal: role “username” does not exist

asked Dec 9, 2020 in SQL by Appu (6.1k points)

0 votes

1 answer

Decode equivalent in postgres

asked Dec 16, 2020 in SQL by Appu (6.1k points)

0 votes

1 answer

How do I get the MIN() of two fields in Postgres?

asked Jul 24, 2019 in SQL by Tech4ever (20.3k points)

0 votes

1 answer

How to create derived attribute?

asked Jan 1, 2021 in SQL by Appu (6.1k points)

Soni Kumari · Answer 1 · 2019-07-17T12:29:57+0000

Following 4 methods to select rows that are not present in other tables, all of them are standard SQL.

1. NOT EXISTS

For more information refer to this link:

https://www.postgresql.org/docs/current/functions-subquery.html#FUNCTIONS-SUBQUERY-EXISTS

This is the fastest in Postgres.

SELECT ip
FROM login_log l
WHERE NOT EXISTS (
SELECT -- SELECT list mostly irrelevant; can just be empty in Postgres
FROM ip_location
WHERE ip = l.ip
);

In EXISTS subqueries, what is easier to read?

2. Use LEFT JOIN / IS NULL

For more information refer to this link:

https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-FROM

Sometimes this is the fastest. But, often the shortest. Most of the time results in the same query plan as NOT EXISTS.

SELECT l.ip
FROM login_log l
LEFT JOIN ip_location i USING (ip) -- short for: ON i.ip = l.ip
WHERE i.ip IS NULL;
Use EXCEPT

3. EXCEPT

For more information refer to this link:

https://www.postgresql.org/docs/current/queries-union.html

The below code will not be easily integrated into more complex queries.

SELECT ip
FROM login_log
EXCEPT ALL -- Here, "ALL" keeps duplicates and makes it faster
SELECT ip
FROM ip_location;

Note: Duplicates will be eliminated unless EXCEPT ALL is used.

Typically, you need the ALL keyword. Even if you don't care, use it because it makes the query faster.

4. Use NOT IN

For more information refer to this link:

https://www.postgresql.org/docs/current/functions-subquery.html#FUNCTIONS-SUBQUERY-NOTIN

It's only good without NULL values or if you know to handle NULL properly.

SELECT ip
FROM login_log
WHERE ip NOT IN (
SELECT DISTINCT ip -- DISTINCT is optional
FROM ip_location
);

NOT IN: It carries a "trap" for NULL values on either side:

Find records where join doesn't exist. Select rows where the value of the second column is not present in the first column.

Chitra · Answer 2 · 2024-11-08T12:00:02+0000

The error is due to a small typo: NOT EXIST should be NOT EXISTS. Here’s the corrected query:

SELECT login_log.ip

FROM login_log

WHERE NOT EXISTS (

SELECT 1

FROM ip_location

WHERE login_log.ip = ip_location.ip

);

Explanation

NOT EXISTS: This checks for the absence of rows from the ip_location table where the IP address matches. If there’s no match, the row from login_log is returned.

SELECT 1: It’s a common practice to use SELECT 1 within EXISTS subqueries for readability and slight optimization.

Performance Consideration:

For better performance, consider creating an index on ip_location.ip if you haven’t already. This will speed up the subquery lookup and make the NOT EXISTS check faster for large tables.

Alternatively, another efficient approach could be to use a LEFT JOIN:

SELECT login_log.ip

FROM login_log

LEFT JOIN ip_location ON login_log.ip = ip_location.ip

WHERE ip_location.ip IS NULL;

This LEFT JOIN will join rows where there’s no match in ip_location and only return rows where ip_location.ip is NULL

Select rows which are not present in other table

2 Answers

Related questions

Browse Categories