Sunday, February 26, 2012

Fun (trouble) with outer join

Okay, I'm having an issue with an outer join that is just confounding
me! Can somebody help explain. I started out with a large query that
was not returning all of the data that I wanted, so i paired it down to
this simple outer join:
SELECT E.EMP_RNG_CDE
FROM DATA_TABLE B LEFT OUTER JOIN
CODE_TABLE E ON E.EMP_RNG_CDE =B.EMP_RNG_CDE
WHERE B.POST_CDE = '60503'
AND B.SEG_CDE = '5'
There are 6 rows in the in "CODE_TABLE" .. while only 2 rows in
"DATA_TABLE" will match.. so I want all 6 out of the "CODE_TABLE" every
time (hence my OUTER JOIN...)
my results look like this however:
EMP_RNG_CDE
1
2
-- I only received the rows that matched from B.. not the outer rows
found in E!
HOWEVER now this drives me really crazy.. if re-work the query to join
my tables in the WHERE Clause and use the "*=" syntax for my left outer
join.. it works just right!
SELECT E.EMP_RNG_CDE
FROM DATA_TABLE B, LEFT OUTER JOIN
CODE_TABLE E
WHERE E.EMP_RNG_CDE *= B.EMP_RNG_CDE AND
B.POST_CDE = '60503'
AND B.SEG_CDE = '5'
Returns:
EMP_RNG_CDE
1
2
3
4
5
6
? Aren't those 2 queries the same? Just with different syntax for the
LEFT OUTER JOIN '
Can somebody help me with why those two are different? And how I can
get the results from the 2nd query with a "OUTER JOIN" syntax in my
FROM clause?
thansks!
jeffprizJeff,
Try switching the order of the tables in the join clause or try a RIGHT
JOIN.
HTH
Jerry
<jeffpriz@.yahoo.com> wrote in message
news:1128372693.433884.78940@.g14g2000cwa.googlegroups.com...
> Okay, I'm having an issue with an outer join that is just confounding
> me! Can somebody help explain. I started out with a large query that
> was not returning all of the data that I wanted, so i paired it down to
> this simple outer join:
> SELECT E.EMP_RNG_CDE
> FROM DATA_TABLE B LEFT OUTER JOIN
> CODE_TABLE E ON E.EMP_RNG_CDE =B.EMP_RNG_CDE
> WHERE B.POST_CDE = '60503'
> AND B.SEG_CDE = '5'
> There are 6 rows in the in "CODE_TABLE" .. while only 2 rows in
> "DATA_TABLE" will match.. so I want all 6 out of the "CODE_TABLE" every
> time (hence my OUTER JOIN...)
> my results look like this however:
> EMP_RNG_CDE
> 1
> 2
> -- I only received the rows that matched from B.. not the outer rows
> found in E!
> HOWEVER now this drives me really crazy.. if re-work the query to join
> my tables in the WHERE Clause and use the "*=" syntax for my left outer
> join.. it works just right!
> SELECT E.EMP_RNG_CDE
> FROM DATA_TABLE B, LEFT OUTER JOIN
> CODE_TABLE E
> WHERE E.EMP_RNG_CDE *= B.EMP_RNG_CDE AND
> B.POST_CDE = '60503'
> AND B.SEG_CDE = '5'
> Returns:
> EMP_RNG_CDE
> 1
> 2
> 3
> 4
> 5
> 6
> ? Aren't those 2 queries the same? Just with different syntax for the
> LEFT OUTER JOIN '
> Can somebody help me with why those two are different? And how I can
> get the results from the 2nd query with a "OUTER JOIN" syntax in my
> FROM clause?
> thansks!
> jeffpriz
>|||The preserved table is on the LEFT or the RIGHT side of the infixed
operator.
Here is how OUTER JOINs work in SQL-92. Assume you are given:
Table1 Table2
a b a c
====== ======
1 w 1 r
2 x 2 s
3 y 3 t
4 z
and the outer join expression:
Table1
LEFT OUTER JOIN
Table2
ON Table1.a = Table2.a <== join condition
AND Table2.c = 't'; <== single table condition
We call Table1 the "preserved table" and Table2 the "unpreserved table"
in the query. What I am going to give you is a little different, but
equivalent to the ANSI/ISO standards.
1) We build the CROSS JOIN of the two tables. Scan each row in the
result set.
2) If the predicate tests TRUE for that row, then you keep it. You also
remove all rows derived from it from the CROSS JOIN
3) If the predicate tests FALSE or UNKNOWN for that row, then keep the
columns from the preserved table, convert all the columns from the
unpreserved table to NULLs and remove the duplicates.
So let us execute this by hand:
Let @. = passed the first predicate
Let * = passed the second predicate
Table1 CROSS JOIN Table2
a b a c
=========================
1 w 1 r @.
1 w 2 s
1 w 3 t *
2 x 1 r
2 x 2 s @.
2 x 3 t *
3 y 1 r
3 y 2 s
3 y 3 t @.* <== the TRUE set
4 z 1 r
4 z 2 s
4 z 3 t *
Table1 LEFT OUTER JOIN Table2
a b a c
=========================
3 y 3 t <= only TRUE row
--
1 w NULL NULL Sets of duplicates
1 w NULL NULL
1 w NULL NULL
--
2 x NULL NULL
2 x NULL NULL
2 x NULL NULL
3 y NULL NULL <== derived from the TRUE set - Remove
3 y NULL NULL
--
4 z NULL NULL
4 z NULL NULL
4 z NULL NULL
the final results:
Table1 LEFT OUTER JOIN Table2
a b a c
=========================
1 w NULL NULL
2 x NULL NULL
3 y 3 t
4 z NULL NULL
The basic rule is that every row in the preserved table is represented
in the results in at least one result row.
There are limitations and very serious problems with the extended
equality version of an outer join used in some diseased mutant
products. Consider the two Chris Date tables
Suppliers SupParts
supno supno partno qty
========= ==============
S1 S1 P1 100
S2 S1 P2 250
S3 S2 P1 100
S2 P2 250
and let's do an extended equality outer join like this:
SELECT *
FROM Supplier, SupParts
WHERE Supplier.supno *= SupParts.supno
AND qty < 200;
If I do the outer first, I get:
Suppliers LOJ SupParts
supno supno partno qty
=======================
S1 S1 P1 100
S1 S1 P2 250
S2 S2 P1 100
S2 S2 P2 250
S3 NULL NULL NULL
Then I apply the (qty < 200) predicate and get
Suppliers LOJ SupParts
supno supno partno qty
===================
S1 S1 P1 100
S2 S2 P1 100
Doing it in the opposite order
Suppliers LOJ SupParts
supno supno partno qty
===================
S1 S1 P1 100
S2 S2 P1 100
S3 NULL NULL NULL
Sybase does it one way, Oracle does it the other and Centura (nee
Gupta) lets you pick which one -- the worst of both non-standard
worlds! In SQL-92, you have a choice and can force the order of
execution. Either do the predicates after the join ...
SELECT *
FROM Supplier
LEFT OUTER JOIN
SupParts
ON Supplier.supno = SupParts.supno
WHERE qty < 200;
.. or do it in the joining:
SELECT *
FROM Supplier
LEFT OUTER JOIN
SupParts
ON Supplier.supno = SupParts.supno
AND qty < 200;
Another problem is that you cannot show the same table as preserved and
unpreserved in the extended equality version, but it is easy in SQL-92.
For example to find the students who have taken Math 101 and might
have taken Math 102:
SELECT C1.student, C1.math, C2.math
FROM (SELECT * FROM Courses WHERE math = 101) AS C1
LEFT OUTER JOIN
(SELECT * FROM Courses WHERE math = 102) AS C2
ON C1.student = C2.student;|||The WHERE clause causes your outer join to "become" an inner join. Not
explicitly, but in effect. (Put as simply as possible.)
Change your where clause like this or - as Jerry suggests - use RIGHT outer
join:
WHERE (B.POST_CDE = '60503' or B.POST_CDE is null)
AND (B.SEG_CDE = '5' or B.SEG_CDE is null)
ML|||Thanks! That actually helped a good bit! (it took me 5 reads, but I got
there!) LOL
I see where it's going. And i got my simple query to work, and I was
able to go back to my full-blown query to get it to work too
correctly..
thanks

No comments:

Post a Comment