Strange behavior with SQL server IN clause
I am wondering why the below SQL statement behaves the way that it does:
select * from tableA where document_id in (select document_id from tableB where batch_id = 99997)
tableA
contains the column document_id
but tableB
does not, so the query appears to return all rows in tableA
, this behavior occurs if you use any field name in the select statement of the IN clause that is a field in tableA
. Using a name not in tableA
or tableB
does result in an error.
It's not an error. In a subquery, you can still use columns form the parent. So when you say
SELECT document_id FROM tableB WHERE batch_id = 99997
You are saying for every row where batch_id
is 9997 in tableB, select document_id
from tableA
. And of course, all those document_id
values exists, so it will return all those rows.
Odd inconsistent behavior of SQL Server "IN" clause, NOT IN will not evaluate to true when one or more NULL values are returned by the subquery. Use NOT EXISTS to avoid this common gotcha. The SQL WITH clause was introduced by Oracle in the Oracle 9i release 2 database. The SQL WITH clause allows you to give a sub-query block a name (a process also called sub-query refactoring), which can be referenced in several places within the main SQL query. The clause is used for defining a temporary relation such that the output of this
This is why I'd suggest you get in the habit of putting the explicit table name for each column. It is also helpful for maintainability or later extending queries.
select * from tableA A where A.document_id in (select B.document_id from tableB B where B.batch_id = 99997)
If you qualify the tables like this, it will throw a clear error on runtime and prevent any subtle mistakes. This would give the error similar to
tableB.document_id does not exist.
Strange behavior with "where not in" clause – SQLServerCentral, NOT IN will not evaluate to true when one or more NULL values are returned by the subquery. Use NOT EXISTS to avoid this common gotcha. See this answer APPLIES TO: SQL Server Azure SQL Database Azure Synapse Analytics (SQL DW) Parallel Data Warehouse. Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE or MERGE statement.
I've just found this behaviour myself after many years. I use Exists() instead of the IN() clause except for ad-hoc and non data modifying queries because of unexpected results in the past with null values, but this had me flummoxed until I read DavidG's response. There is no reason to write a query in such a manner, referencing an outer column without referencing the subqueries table at all but a syntax checker would have trouble disallowing it.
Mostly if using IN() with a sub query you would be referencing a foreign key which, 95+% of the time (with our company standards), has the same name as the primary key, but those 5% of times could catch you out.
Although it seems this behavior also occurs with Exists()...
Select * from tableA Where Exists(Select document_id from tableB)
...Exists() is never written this way because it makes no sense but is written more explicitly something like...
Select * From tableA Where Exists(Select * from tableB tB where tB.TableB_Document_id = tableA.document_id)
and so less likely to occur.
(Note: I would've put this in a comment but I don't have the points)
Declared Variable in WHERE clause weird behavior , i've two tables that i'm comparing to generate a difference set. in the past, i've used a "not in" clause to compare attributes since it seems to He has authored 12 SQL Server database books, 32 Pluralsight courses and has written over 5000 articles on the database technology on his blog at a https://blog.sqlauthority.com. Along with 16+ years of hands-on experience he holds a Masters of Science degree and a number of database certifications.
Sql Server In Clause With Multiple Literal Values Performance , Declared Variable in WHERE clause weird behavior – Learn more on the SQLServerCentral forums. SQL Server Execution Times: CPU time Now's where it gets weird and I do not understand the behavior: Run the same query from Feb 1 - Feb 5 and DB Engine decides to do a table scan (not fast, not happy) Feb 1 - Feb 4 Index seek, great!
Deploying Visual FoxPro Solutions, Sql Server In Clause With Multiple Literal Values Performance Behaviour It's good to know the query optimizer default behavior, So you can Moderator replied the topic: Re: Strange behavior using JOIN clause through linked server Yes, this is a limitation of the Trial version. The OLEDB provider limits the resulting rowset size to 100 rows.
Subquery in the WHERE Clause, If we use SQL Server as an example, you can perform complete or partial Index changes Corruptindexes can trigger strange behavior with sorts and ordering, to the SQL SelectoRDER BY clause, and cause your application to trigger the Learn SQL by doing 50+ interactive coding exercises. Start now!
Comments
- Not strange at all, this is by (flawed) design and won't be fixed, however it has been reported as a bug about a dozen times. See msdn2.microsoft.com/en-us/library/ms187638.aspx connect.microsoft.com/SQL/feedback/details/338468 connect.microsoft.com/SQL/feedback/details/302281 connect.microsoft.com/SQL/feedback/details/542289 connect.microsoft.com/SQL/feedback/details/735178 connect.microsoft.com/SQL/feedback/details/786947 connect.microsoft.com/SQL/feedback/details/499463 362016 624370 etc etc
- If you don't have the rep to leave comments, you should look to answer questions not requiring clarifications or ask your own questions. Not abuse other mechanisms of SO. If we wanted low rep users to be able to comment, they'd have that ability. You're not going to earn rep by doing things like this.
- As said, this is a comment, not an answer. StackOverflow isn't a forum. Take the tour: "Ask questions, get answers, no distractions"
- @CertainPerformance. Although this is only the 2nd time I've replied in SO I use it often for finding solutions to problems or as code reminders. Often the useful bit of info is not a direct answer but buried in a comment. I do understand the need for strictures lest useful replies get lost in waffle, as long as the strictures use one sense of the word and are not driven by the other. Perhaps my biggest mistake was mentioning at the end that i would have put it in comment if I could, because arsebin3 answer is also effectively a comment in much the same vein as mine.