Oracle SQL performance optimization

I have two SQL statements that I expected to perform similarly, but SQL1 takes 0.065 seconds while SQL2 takes over 10 seconds, on a table with only 8000 records in total. Can anyone explain this? How can I optimize SQL2?

SQL 1:

select
    job_id,
    JOB_DESCRIPTION,
    REGEXP_COUNT(JOB_Description, '(ABC|DEF)([[:digit:]]){5}') as occurrences 
from smms.job 
where TO_NUMBER(to_char(CREATE_DATE,'YYYY')) = 2017;

SQL 2:

select job_id, JOB_Description 
from (
    select 
        job_id, 
        JOB_DESCRIPTION,
        REGEXP_COUNT(JOB_Description, '(ABC|DEF)([[:digit:]]){5}') as occurrences 
    from smms.job 
    where TO_NUMBER(to_char(CREATE_DATE,'YYYY')) = 2017
) 
where occurrences > 0;

As Martin pointed out, the issue is the expensive regexp_count function. So the question reduces to:

Why is:

  select * from (
  with dat as (select level lv, rpad('X',500,'X') txt from dual connect by level <= 20000)
  select lv, 
         REGEXP_COUNT(txt, '(ABC|DEF)([[:digit:]]){5}') as occurrences 
  from   dat 
  --where  REGEXP_COUNT(txt, '(ABC|DEF)([[:digit:]]){5}') > 1
  ) where rownum > 1

0.019 seconds and

  select * from (
  with dat as (select level lv, rpad('X',500,'X') txt from dual connect by level <= 20000)
  select lv, 
         REGEXP_COUNT(txt, '(ABC|DEF)([[:digit:]]){5}') as occurrences 
  from   dat 
  where  REGEXP_COUNT(txt, '(ABC|DEF)([[:digit:]]){5}') > 1
  ) where rownum > 1

6.7 seconds? Oracle evaluates regexp_count in both executions, so there must be a difference between how it is evaluated in the WHERE clause and how it is evaluated in the select list.
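
One way to check how much of the gap comes from the select-list expression is to force it to be computed for every generated row, for example by aggregating over it. The following is only a sketch built on the same generated data as above; the alias total_occurrences is made up for illustration, and timings will vary by system.

```sql
-- Same 20000 generated rows as in the two test queries above.
with dat as (
    select level lv, rpad('X', 500, 'X') txt
    from dual
    connect by level <= 20000
)
-- Aggregating over the expression means every row's REGEXP_COUNT result
-- is needed, so it cannot be skipped for rows that are never returned.
select sum(regexp_count(txt, '(ABC|DEF)([[:digit:]]){5}')) as total_occurrences
from dat;
```

If this runs about as long as the slow query, the cost is the regular expression itself rather than where in the statement it appears.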

SQL1 filters by TO_NUMBER(to_char(CREATE_DATE,'YYYY')) = 2017 and then executes REGEXP_COUNT once for each row returned.

SQL2 filters by the result of REGEXP_COUNT, which means it executes the function against all table rows. Only then does it filter that result by TO_NUMBER(to_char(CREATE_DATE,'YYYY')) = 2017.

To test this, execute SQL1 without the filter. It should take approximately as long as SQL2, maybe a little longer.
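
For reference, SQL1 without its filter would look like the statement below. It is simply the query from the question with the WHERE clause removed, so REGEXP_COUNT has to run for every row in smms.job.

```sql
-- SQL1 with the CREATE_DATE filter removed:
-- the regular expression is evaluated for all rows of the table.
select
    job_id,
    JOB_DESCRIPTION,
    REGEXP_COUNT(JOB_Description, '(ABC|DEF)([[:digit:]]){5}') as occurrences
from smms.job;
```

Fetch all rows when timing it, since many client tools stop after the first page of results.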

To optimize, you need to make sure the SQL1 filter is applied first. A reliable way is to execute SQL1, store its results in a temporary (or in-memory) table, and then apply the SQL2 filter to those results.
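
A minimal sketch of that idea without a separate temporary table is a CTE with the MATERIALIZE hint, which is what the asker reports using in the comments. Note that MATERIALIZE is an undocumented hint, so its behavior may differ between Oracle versions, and the CTE name jobs_2017 is made up for illustration.

```sql
-- Step 1: apply the cheap CREATE_DATE filter and ask Oracle to
-- materialize the intermediate result (undocumented hint, see above).
with jobs_2017 as (
    select /*+ materialize */
        job_id,
        JOB_DESCRIPTION
    from smms.job
    where TO_NUMBER(to_char(CREATE_DATE,'YYYY')) = 2017
)
-- Step 2: run the expensive regular expression only on the 2017 rows.
select job_id, JOB_DESCRIPTION
from jobs_2017
where REGEXP_COUNT(JOB_Description, '(ABC|DEF)([[:digit:]]){5}') > 0;
```

Compare the execution plan with and without the hint to confirm that the date filter really is applied before the regular expression.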

Comments
  • Although this is somewhat separate from your question, why does "SQL 2" use the subquery? occurrences is not used in the final select list, so I don't see why you don't just use REGEXP_COUNT directly in the WHERE clause, i.e. SELECT job_id, job_description FROM smms.job WHERE TO_NUMBER... = 2017 AND REGEXP_COUNT... > 0
  • What are the execution plans of the queries? What happens if TO_NUMBER(to_char(CREATE_DATE,'YYYY')) = 2017 is changed to CREATE_DATE >= DATE'2017-01-01' AND CREATE_DATE < DATE'2018-01-01'?
  • The clause you mentioned was my first version, but it also takes over 10 seconds. That's why I tried a different clause to optimize it.
  • I guess that version 1 is able to restrict the rather expensive regexp operation to a smaller resultset - the execution plans for both versions should show the difference (or maybe a sql trace with event 10046). Is there an index on create_date?
  • SQL3: select count(*) from smms.job where REGEXP_COUNT(JOB_Description, '(ABC|DEF)([[:digit:]]){5}') > 0 takes 10 seconds as well, which means the performance has nothing to do with the CREATE_DATE filter; it depends wholly on the REGEXP_COUNT clause.
  • Yes, you are correct! If I first use /*+ materialize */ on TO_NUMBER(to_char(CREATE_DATE,'YYYY')) = 2017 in a CTE and then apply the REGEXP_COUNT, performance improves greatly. Without the materialize hint, the execution plan and processing time are the same no matter how I reorganize the filter clauses. Thank you so much!
  • I don't quite understand your point, but rownum > 1 can materialize the SQL, thus improving the performance.
  • rownum > 1 is not there to improve performance. It is there to measure the time a statement needs without having to fetch to the end of the cursor. So queries 1 and 2 do the same thing, yet query 2 is slower by a large factor.
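
Several comments ask for the execution plans. A common way to obtain one is EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY, sketched here for SQL 2 from the question. This assumes the session has access to a plan table, which is the default in current Oracle versions.

```sql
-- Generate the optimizer's plan for SQL 2 without executing it.
explain plan for
select job_id, JOB_Description
from (
    select
        job_id,
        JOB_DESCRIPTION,
        REGEXP_COUNT(JOB_Description, '(ABC|DEF)([[:digit:]]){5}') as occurrences
    from smms.job
    where TO_NUMBER(to_char(CREATE_DATE,'YYYY')) = 2017
)
where occurrences > 0;

-- Show the plan, including the Predicate Information section.
select * from table(dbms_xplan.display);
```

The Predicate Information section lists the filter conditions applied at each plan step, which helps when comparing how the two statements handle the REGEXP_COUNT and CREATE_DATE predicates.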