The NULL Trap: When WHERE <> Doesn't Do What You Think

Earlier this year, we had a bug that slipped through code review, made it to production, and took us a day to diagnose. The query was simple (a hypothetical example, the real query in our internal system is different):

SELECT * FROM message_threads WHERE assignee_id != user_1_id;

We wanted to get all threads not assigned to user_1. That should include threads assigned to user_2, user_3, user_4, and—critically—threads with no assignee at all (unassigned threads).

It didn’t. The unassigned threads were missing from the results.

The Assumption That Broke Us

We assumed != works like a boolean filter: “give me everything that’s not this value.” In most programming languages, that’s what != does. If you have [foo, bar, null] and filter by != 'bar', you get [foo, null].

Not in SQL.

Postgres (and standard SQL) uses three-valued logic: TRUE, FALSE, and UNKNOWN. Any comparison with NULL yields UNKNOWN, not TRUE or FALSE. When you write:

WHERE column != 'bar'

Postgres evaluates each row:

'foo' != 'bar' → TRUE (included)
'bar' != 'bar' → FALSE (excluded)
NULL != 'bar' → UNKNOWN (excluded)

UNKNOWN is treated as FALSE in WHERE clauses. So NULLs are always filtered out when using !=, =, <, >, or any standard comparison operator.

Our query WHERE assignee_id != user_1_id was returning only threads with a non-null assignee that wasn’t user_1. All unassigned threads (where assignee_id IS NULL) disappeared.

The Real Impact

In our real system (not the hypothetical example on top), it was a query that validates against ticket duplication. The query was allowing all duplicates because certain tickets were missing a relation (by design) thus, each time the user scans the ticket, it will not be considered a duplicate at first then the system behaves differently.

The fix wasn’t just “add OR assignee_id IS NULL“—it was understanding that our mental model of inequality was wrong for Postgres.

How to Actually Get “Not This Value” Including NULLs

Postgres gives you a few correct ways to express “not equal, treating NULL as a distinct value.”

Option 1: `IS DISTINCT FROM`

SELECT * FROM message_threads WHERE assignee_id IS DISTINCT FROM user_1_id;

This is the NULL-safe inequality operator. It returns TRUE for:

Any value that is not user_1_id
NULL values (because NULL IS DISTINCT FROM user_1_id is TRUE)

It returns FALSE only when the value is exactly user_1_id.

Option 2: Explicit NULL handling

SELECT * FROM message_threads WHERE assignee_id != user_1_id OR assignee_id IS NULL;

This is what we initially patpped. It works, but it’s verbose and easy to forget the OR clause.

Performance and Edge Cases

IS DISTINCT FROM is the cleanest, but know these details:

Index usage: IS DISTINCT FROM can use a b-tree index on assignee_id. Postgres can transform it into a range scan. The same goes for the explicit OR assignee_id IS NULL pattern.
NULL vs NULL: NULL IS DISTINCT FROM NULL returns FALSE (they are not distinct; both are NULL). If you want NULLs to match each other, use IS NOT DISTINCT FROM (the NULL-safe equality).
Portability: IS DISTINCT FROM is standard SQL and supported by Postgres, MySQL (as <=> for equality only), and others. Check your database’s support.
Readability: IS DISTINCT FROM reads clearly once you know it exists. The explicit OR pattern is more universally understood but more verbose.

Conclusion

The bug wasn’t about knowing IS DISTINCT FROM. It was about questioning your thinking model.

We wrote != user_1_id assuming it meant “not user_1.” In SQL, it means “not user_1 and not unknown.” That subtle difference is why NULLs disappeared. After this incident, we audit all inequality queries on nullable columns. We ask: “Do we want NULLs here?” If yes, we use IS DISTINCT FROM or add the explicit null check.

More broadly: when a query returns fewer rows than expected, check your NULL logic. A missing IS NULL clause is a common source of silent data loss.