SELECT
clause: Only request the columns you need, as it reduces the amount of data the query engine needs to process.
LIMIT
clause: If you are only interested in a specific number of rows, use the LIMIT
clause to avoid processing more data than necessary.
UNION ALL
instead of UNION
: If you’re combining the results of multiple queries, use UNION ALL
instead of UNION
to avoid the overhead of removing duplicate rows.
ORDER BY
.
UNION ALL
instead of OR
: If you need to combine the results of multiple queries, use UNION ALL
instead of OR
to avoid the overhead of removing duplicate rows. For example, instead of using SELECT * FROM table WHERE column = 'value1' OR column = 'value2'
, use SELECT * FROM table WHERE column = 'value1' UNION ALL SELECT * FROM table WHERE column = 'value2'
.
SELECT * FROM transactions WHERE from_address = lower('0x1234567890ABCDEF')
, use SELECT * FROM transactions WHERE from_address = '0x1234567890abcdef'
. This allows the query engine to use the min/max values of the from_address column to skip entire ORC files or row groups within files while scanning through a table.