Friday, February 24, 2012

Can someone help me with multiple "Left Outer Joins"?

I have a SQL query I'm invoking via VB6 & ADO 2.8, that requires three
"Left Outer Joins" in order to return every transaction for a specific
set of criteria.

Using three "Left Outer Joins" slows the system down considerably.

I've tried creating a temp db, but I can't figure out how to execute
two select commands. (It throws the exception "The column prefix
'tempdb' does not match with a table name or alias name used in the
query.")

Looking for suggestions (and a lesson or two!) This is my first attempt
at SQL.

Current (working, albeit slowly) Query Below

TIA

SELECT

LEDGER_ENTRY.entry_amount,
LEDGER_TRANSACTION.credit_card_exp_date,
LEDGER_ENTRY.entry_datetime,
LEDGER_ENTRY.employee_id,
LEDGER_ENTRY.voucher_explanation,
LEDGER_ENTRY.card_reader_used_ind,
STAY.room_id,
GUEST.guest_lastname,
GUEST.guest_firstname,
STAY.arrival_time,
STAY.departure_time,
STAY.arrival_date,
STAY.original_departure_date,
STAY.no_show_status,
STAY.cancellation_date,
FOLIO.house_acct_id,
FOLIO.group_code,
LEDGER_TRANSACTION.original_receipt_id

FROM

mydb.dbo.LEDGER_ENTRY LEDGER_ENTRY,
mydb.dbo.LEDGER_TRANSACTION LEDGER_TRANSACTION,

mydb.dbo.FOLIO FOLIO
LEFT OUTER JOIN
mydb.dbo.STAY_FOLIO STAY_FOLIO
ON
FOLIO.folio_id = STAY_FOLIO.folio_id
LEFT OUTER JOIN
mydb.dbo.STAY STAY
ON
STAY_FOLIO.stay_id = STAY.stay_id
LEFT OUTER JOIN
mydb.dbo.GUEST GUEST
ON
FOLIO.guest_id = GUEST.guest_id

WHERE

LEDGER_ENTRY.trans_id = LEDGER_TRANSACTION.trans_id
AND FOLIO.folio_id = LEDGER_TRANSACTION.folio_id
AND LEDGER_ENTRY.payment_method='3737******6100'
AND LEDGER_ENTRY.property_id='abc123'

ORDER BY

LEDGER_ENTRY.entry_datetime DESCWhat is actually your question :-) ?

Jens Suessmeyer.|||My question is, Can this query be further optimized for speed?

I have tried creating temporary databases, to break-up the outer joins
into different select commands, but I couldn't get it to work properly.
I'm using VB6 and ADO, invoking the execute method of the adodb.command
object to return the recordset.|||To take access to different databases and tables you may
use for example syntax like this:
DatabaseName.TableName.ColumnName

I do not see why you would need to create a
temporal db and why this would help you
with performance.

I wonder if you meant a temporal table instead.

In general I experienced, that views (depending on what the do) may slow
down the whole query.
also ORDER BY.

I suggest you break your select statement in three peaces so you may
see with the profiler wich join would take the most of time.
maybe by applying an index to specific columns you get a bit more
performance.

if you watch the query from your VB-application, you wil have to
differ between the time thats used by your application and ADO
and the time the Database itself needs.
the bottleneck could also be at the application-side!

Hope this gave some hints.

Sonja

"Steve" <budgethelp@.yahoo.com> schrieb im Newsbeitrag
news:1126754368.398119.129660@.o13g2000cwo.googlegr oups.com...
>I have a SQL query I'm invoking via VB6 & ADO 2.8, that requires three
> "Left Outer Joins" in order to return every transaction for a specific
> set of criteria.
> Using three "Left Outer Joins" slows the system down considerably.
> I've tried creating a temp db, but I can't figure out how to execute
> two select commands. (It throws the exception "The column prefix
> 'tempdb' does not match with a table name or alias name used in the
> query.")
> Looking for suggestions (and a lesson or two!) This is my first attempt
> at SQL.
> Current (working, albeit slowly) Query Below
> TIA
> SELECT
> LEDGER_ENTRY.entry_amount,
> LEDGER_TRANSACTION.credit_card_exp_date,
> LEDGER_ENTRY.entry_datetime,
> LEDGER_ENTRY.employee_id,
> LEDGER_ENTRY.voucher_explanation,
> LEDGER_ENTRY.card_reader_used_ind,
> STAY.room_id,
> GUEST.guest_lastname,
> GUEST.guest_firstname,
> STAY.arrival_time,
> STAY.departure_time,
> STAY.arrival_date,
> STAY.original_departure_date,
> STAY.no_show_status,
> STAY.cancellation_date,
> FOLIO.house_acct_id,
> FOLIO.group_code,
> LEDGER_TRANSACTION.original_receipt_id
> FROM
> mydb.dbo.LEDGER_ENTRY LEDGER_ENTRY,
> mydb.dbo.LEDGER_TRANSACTION LEDGER_TRANSACTION,
> mydb.dbo.FOLIO FOLIO
> LEFT OUTER JOIN
> mydb.dbo.STAY_FOLIO STAY_FOLIO
> ON
> FOLIO.folio_id = STAY_FOLIO.folio_id
> LEFT OUTER JOIN
> mydb.dbo.STAY STAY
> ON
> STAY_FOLIO.stay_id = STAY.stay_id
> LEFT OUTER JOIN
> mydb.dbo.GUEST GUEST
> ON
> FOLIO.guest_id = GUEST.guest_id
> WHERE
> LEDGER_ENTRY.trans_id = LEDGER_TRANSACTION.trans_id
> AND FOLIO.folio_id = LEDGER_TRANSACTION.folio_id
> AND LEDGER_ENTRY.payment_method='3737******6100'
> AND LEDGER_ENTRY.property_id='abc123'
> ORDER BY
> LEDGER_ENTRY.entry_datetime DESC|||Yes, I meant that I tried to create a temporary table, not db, sorry...

Regarding the 3 joins...would putting parenthesis around any of them
help?
How are they being processed exactly?
The first join has a single table reference immediately preceding the
join statement, but the others cannot (is that correct?)
What are the next two joins being joined to exacty (since there is no
table specified before the two join statements?

The tables that I'm joining look like this:

--FOLIO-----STAY FOLIO------STAY
|
|
|___________GUEST

All transactions have FOLIO records, but not all transactions have STAY
FOLIO, STAY, OR GUEST records.
I need to return all transactions that have a folio record.

This is the syntax I'm using to accomplish this:

mydb.dbo.FOLIO FOLIO
LEFT OUTER JOIN
mydb.dbo.STAY_FOLIO STAY_FOLIO
ON
FOLIO.folio_id = STAY_FOLIO.folio_id
LEFT OUTER JOIN
mydb.dbo.STAY STAY
ON
STAY_FOLIO.stay_id = STAY.stay_id
LEFT OUTER JOIN
mydb.dbo.GUEST GUEST
ON
FOLIO.guest_id = GUEST.guest_id

I don't understand how the order of the joins affects their processing.
Is there a better way to phrase the joins, given the table
relationships as outlined above?

Thanks!|||> Yes, I meant that I tried to create a temporary table, not db, sorry...

this would be accomplished with views. Like I mentioned before
but this may not be a Solution for your problem.

As I know a lot of Select Squences with a lot more Joins
than you need here, I do not believe that your performance-problem
results from the sql statement.
Depending on the server-machine your Database is installed on,
there may be different reasons, why this query takes a long time.

1)Maybe your tables are big. Lets asume each of them has 1 000 000 tuples.
Even then the query should not last (DEPENDING ON YOUR MACHINE)
a "long" time.
If this Machine is for example the whole time working on a 70% level
it slows down everything to death.
If the machine has enough breath to acomplish your query and you are testing
just solely we leave this section ...

2)the dbms tries to optimize sql -queries by itself, to make them faster, if
you want to
optimize more, use only the lines and columns you seek. It makes the whole
thing
a little faster if you simply snip columns and rows that you do not need.

3) it may help with performance to apply indexes to columns that will be
joined

4) Your application is getting all data over network one by one and
everything
slows down. Then its not a database or query -problem

5) use the SQL Profiler to see where the bottleneck is.

If you like, create for each join a view and then simply join the view with
folio
like this for example
------------
CREATE VIEW stay_test AS
Select Stay_folio.folio_id from
STAY_FOLIO left outer join STAY
ON
stay_folio.stay_id = stay.stay_id
-----------
SELECT * FROM
folio LEFT OUTER JOIN stay_test
ON
folio.folio_id = stay_test.folio_id
LEFT OUTER JOIN guest
ON
folio.guest_id = guest.guest_id
-----------

At the SQL profiler you can view each selection that is made and how long it
takes to get result

> Regarding the 3 joins...would putting parenthesis around any of them
> help?
> How are they being processed exactly?
> The first join has a single table reference immediately preceding the
> join statement, but the others cannot (is that correct?)
> What are the next two joins being joined to exacty (since there is no
> table specified before the two join statements?
> The tables that I'm joining look like this:
> --FOLIO-----STAY FOLIO------STAY
> |
> |
> |___________GUEST
> All transactions have FOLIO records, but not all transactions have STAY
> FOLIO, STAY, OR GUEST records.
> I need to return all transactions that have a folio record.
> This is the syntax I'm using to accomplish this:
> mydb.dbo.FOLIO FOLIO
> LEFT OUTER JOIN
> mydb.dbo.STAY_FOLIO STAY_FOLIO
> ON
> FOLIO.folio_id = STAY_FOLIO.folio_id
> LEFT OUTER JOIN
> mydb.dbo.STAY STAY
> ON
> STAY_FOLIO.stay_id = STAY.stay_id
> LEFT OUTER JOIN
> mydb.dbo.GUEST GUEST
> ON
> FOLIO.guest_id = GUEST.guest_id
> I don't understand how the order of the joins affects their processing.
> Is there a better way to phrase the joins, given the table
> relationships as outlined above?
> Thanks!|||On 14 Sep 2005 20:19:28 -0700, Steve wrote:

>I have a SQL query I'm invoking via VB6 & ADO 2.8, that requires three
>"Left Outer Joins" in order to return every transaction for a specific
>set of criteria.
>Using three "Left Outer Joins" slows the system down considerably.

Hi Steve,

That need not be the case. I guess that adding the right indexes would
help a lot.

>I've tried creating a temp db, but I can't figure out how to execute
>two select commands. (It throws the exception "The column prefix
>'tempdb' does not match with a table name or alias name used in the
>query.")

I could help you with solving this problem, but I won't. Breaking a
query in smaller pieces with temp tables has a fair chance to hurt your
performance, and very limited chance to do any good.

The query optimizer can use all the tricks that you can use, and then
some. Better to trust that the optimizer will pick the right execution
plan from the flock of available options instead of forcing it to do the
way you think is best. There ARE cases where the optimizer does need
some guidance, but they are the exception rather than the rule.

>Current (working, albeit slowly) Query Below

Thanks for posting the query, but you'll have to provide a lot more
information to enable us to help you. We need to know the structure of
your tables (posted as CREATE TABLE statements, including all properties
and constraints, but excluding irrelevant columns), the indexes you have
defined for your tables, if any (posted as CREATE INDEX statements), a
few rows of sample data (posted as INSERT statements) and the expected
results from that sample data to give us an idea what you're trying to
achieve. Including a short description of your actual business problem
is a great idea too. See www.aspfaq.com/5006 for some useful pointers on
hjow to assemble the information we need, in the best format.

Oh, and we'd also like to know how many rows (approximately) you have in
each of your tables - and the execution plan that is currently used for
your query (you can get the execution plan if you run the query with SET
SHOWPLAN_ALL ON.

With that information, we can try to find out why your current query is
running slow, and how to remedy that.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)|||Steve (budgethelp@.yahoo.com) writes:
> I have a SQL query I'm invoking via VB6 & ADO 2.8, that requires three
> "Left Outer Joins" in order to return every transaction for a specific
> set of criteria.
> Using three "Left Outer Joins" slows the system down considerably.
> I've tried creating a temp db, but I can't figure out how to execute
> two select commands. (It throws the exception "The column prefix
> 'tempdb' does not match with a table name or alias name used in the
> query.")
> Looking for suggestions (and a lesson or two!) This is my first attempt
> at SQL.

As Hugo pointed out, it is impossible to give very precise advice from
from the information you have posted. Assuming that there is an index
on (payment_method, property_id) on LEDGER_ENTRY, and that all other
tables have indexes on the columns you join on, I would expect the query
to perform well. Then again, there can be several reasons to why it does
not.

I analysed your query, and I think that I found one flaw. Here is a
rewritten version:

SELECT LE.entry_amount, LT.credit_card_exp_date, LE.entry_datetime,
LE.employee_id, LE.voucher_explanation, LE.card_reader_used_ind,
S.room_id, G.guest_lastname, G.guest_firstname, S.arrival_time,
S.departure_time, S.arrival_date, S.original_departure_date,
S.no_show_status, S.cancellation_date, F.house_acct_id,
F.group_code, LT.original_receipt_id
FROM mydb.dbo.LEDGER_ENTRY LE
JOIN mydb.dbo.LEDGER_TRANSACTON LT ON LE.trans_id = LT.trans_id
JOIN mydb.dbo.FOLIO F ON F.folio_id = LT.folio_id
LEFT JOIN (mydb.dbo.STAY_FOLIO SF
JOIN mydb.dbo.STAY S ON SF.stay_id = S.stay_id)
ON F.folio_id = SF.folio_id
LEFT JOIN mydb.dbo.GUEST G ON F.guest_id = G.guest_id
WHERE LE.payment_method='3737******6100'
AND LE.property_id='abc123'
ORDER BY LE.entry_datetime DESC

This alters the semantics of the query slightly, and I guess to the
good. Whether it affects performance, I don't know.

One potential problem is if the joins from FOLIO to STAY_FOLIO and GUEST
could hit multiple rows in the latter tables. In such case you get too
many rows back, which also could cause poor performance.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

No comments:

Post a Comment