Strange behavior with SQL server IN clause

I am wondering why the below SQL statement behaves the way that it does:

select * 
from tableA 
where document_id in (select document_id
                      from  tableB 
                      where batch_id = 99997)

tableA contains the column document_id but tableB does not, so the query appears to return all rows in tableA, this behavior occurs if you use any field name in the select statement of the IN clause that is a field in tableA. Using a name not in tableA or tableB does result in an error.

It's not an error. In a subquery, you can still use columns form the parent. So when you say

SELECT document_id FROM tableB WHERE batch_id = 99997

You are saying for every row where batch_id is 9997 in tableB, select document_id from tableA. And of course, all those document_id values exists, so it will return all those rows.

Odd inconsistent behavior of SQL Server "IN" clause, NOT IN will not evaluate to true when one or more NULL values are returned by the subquery. Use NOT EXISTS to avoid this common gotcha. The SQL WITH clause was introduced by Oracle in the Oracle 9i release 2 database. The SQL WITH clause allows you to give a sub-query block a name (a process also called sub-query refactoring), which can be referenced in several places within the main SQL query. The clause is used for defining a temporary relation such that the output of this

This is why I'd suggest you get in the habit of putting the explicit table name for each column. It is also helpful for maintainability or later extending queries.

select * 
from tableA A
where A.document_id in (select B.document_id
                  from  tableB B
                  where B.batch_id = 99997)

If you qualify the tables like this, it will throw a clear error on runtime and prevent any subtle mistakes. This would give the error similar to

tableB.document_id does not exist.

Strange behavior with "where not in" clause – SQLServerCentral, NOT IN will not evaluate to true when one or more NULL values are returned by the subquery. Use NOT EXISTS to avoid this common gotcha. See this answer  APPLIES TO: SQL Server Azure SQL Database Azure Synapse Analytics (SQL DW) Parallel Data Warehouse. Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE or MERGE statement.

I've just found this behaviour myself after many years. I use Exists() instead of the IN() clause except for ad-hoc and non data modifying queries because of unexpected results in the past with null values, but this had me flummoxed until I read DavidG's response. There is no reason to write a query in such a manner, referencing an outer column without referencing the subqueries table at all but a syntax checker would have trouble disallowing it.

Mostly if using IN() with a sub query you would be referencing a foreign key which, 95+% of the time (with our company standards), has the same name as the primary key, but those 5% of times could catch you out.

Although it seems this behavior also occurs with Exists()...

Select * from tableA Where Exists(Select document_id from tableB)

...Exists() is never written this way because it makes no sense but is written more explicitly something like...

Select * From tableA Where Exists(Select * from tableB tB where tB.TableB_Document_id = tableA.document_id)

and so less likely to occur.

(Note: I would've put this in a comment but I don't have the points)

Declared Variable in WHERE clause weird behavior , i've two tables that i'm comparing to generate a difference set. in the past, i've used a "not in" clause to compare attributes since it seems to  He has authored 12 SQL Server database books, 32 Pluralsight courses and has written over 5000 articles on the database technology on his blog at a Along with 16+ years of hands-on experience he holds a Masters of Science degree and a number of database certifications.

Sql Server In Clause With Multiple Literal Values Performance , Declared Variable in WHERE clause weird behavior – Learn more on the SQLServerCentral forums. SQL Server Execution Times: CPU time  Now's where it gets weird and I do not understand the behavior: Run the same query from Feb 1 - Feb 5 and DB Engine decides to do a table scan (not fast, not happy) Feb 1 - Feb 4 Index seek, great!

Deploying Visual FoxPro Solutions, Sql Server In Clause With Multiple Literal Values Performance Behaviour It's good to know the query optimizer default behavior, So you can  Moderator replied the topic: Re: Strange behavior using JOIN clause through linked server Yes, this is a limitation of the Trial version. The OLEDB provider limits the resulting rowset size to 100 rows.

Subquery in the WHERE Clause, If we use SQL Server as an example, you can perform complete or partial Index changes Corruptindexes can trigger strange behavior with sorts and ordering, to the SQL SelectoRDER BY clause, and cause your application to trigger the  Learn SQL by doing 50+ interactive coding exercises. Start now!