Overview of Query Processing


SQL is a declarative language: the user specifies only WHAT they want from the database, not HOW to get it. The “how” part is the responsibility of the database engine. Query processing is a complex task that involves several steps to translate a user’s query into something the computer can actually execute to retrieve the results.
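To make the what/how split concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, index name, and data are made up for illustration. The query text never changes, yet the engine is free to pick a different way of evaluating it once an index exists. SQLite's EXPLAIN QUERY PLAN is used to peek at that choice; its exact wording varies between SQLite versions, so the sketch does not depend on it.

```python
import sqlite3

def plan_details(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employees (name, dept) VALUES (?, ?)",
    [("Ann", "eng"), ("Bob", "ops"), ("Cid", "eng")],
)

# The WHAT: this text stays the same throughout.
query = "SELECT name FROM employees WHERE dept = 'eng' ORDER BY name"

rows_before = conn.execute(query).fetchall()
plan_before = plan_details(conn, query)   # the HOW, chosen without an index

conn.execute("CREATE INDEX idx_emp_dept ON employees (dept)")

rows_after = conn.execute(query).fetchall()
plan_after = plan_details(conn, query)    # the HOW, chosen with an index available

print(rows_before == rows_after)          # same answer either way
print(plan_before)
print(plan_after)
```

The results are identical in both runs; only the plan reported by the engine differs, which is exactly the part the user never had to write.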

Query processing usually consists of the following phases:
  1. Parsing
    In this phase, the engine analyzes the text of the SQL query to make sure that it is written correctly, and reports any syntax errors (misspellings, unrecognized operators, etc.). If the query has no errors, it is translated into an internal representation that is used for later processing.
  2. Semantic Checking
    Now that the text of the query has no syntax errors, its semantics are checked. The engine makes sure that any referenced tables, columns, or views exist and are valid. This phase is sometimes considered part of the parsing phase.
  3. Query Rewrite
    The same query can usually be written in several different ways in SQL. Complex queries often contain redundancy, especially when they involve views. The query rewrite phase transforms a given SQL query into another query that is equivalent (produces the same result) but may be processed more efficiently.
  4. Query Optimization
    This is probably the most important phase in query processing, in which the engine tries to figure out the best way to evaluate the query (the “how” part). The same query can usually be evaluated in multiple ways, depending on which tables are joined first, which join methods are used, whether the predicates are tested at the beginning or at the end, etc. Each such way is called a query execution plan (or just a query plan). For the same query, some plans are more efficient than others. The job of the query optimizer is to:
    1. Determine the candidate query plans for the query at hand (in practice often only a subset of all possible plans, since the full space can be very large)
    2. Estimate the execution cost of each of these plans, using the statistics available in the database catalog
    3. Pick the plan with the lowest estimated cost
  5. Code Generation
    Once a query plan is chosen, the system translates it into an executable form. Depending on the engine, this may be machine code or an internal representation that the engine interprets.
  6. Plan Execution
    Finally, the query is executed and the results are returned.
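The three optimizer steps from phase 4 (enumerate plans, estimate costs, pick the cheapest) can be sketched in a few lines of Python. This is a toy model under loud assumptions, not how any particular engine works: the table names, row counts, and join selectivities are invented stand-ins for catalog statistics, a "plan" is reduced to a left-deep join order, and the cost is simply the total number of rows in the intermediate results. Real optimizers use much richer cost models and usually prune the search space rather than trying every order.

```python
from itertools import permutations

# Hypothetical catalog statistics: estimated row count per table.
ROW_COUNTS = {"orders": 100_000, "customers": 5_000, "regions": 20}

# Hypothetical join selectivities: the fraction of row pairs that match.
# Pairs without an entry have no join predicate (a cross product).
SELECTIVITY = {
    frozenset({"orders", "customers"}): 1 / 5_000,
    frozenset({"customers", "regions"}): 1 / 20,
}

def join_selectivity(joined, new_table):
    """Combined selectivity of joining new_table to the tables joined so far."""
    sel = 1.0
    for table in joined:
        sel *= SELECTIVITY.get(frozenset({table, new_table}), 1.0)
    return sel

def estimate_cost(order):
    """Step 2: toy cost = sum of estimated intermediate result sizes."""
    joined = [order[0]]
    rows = ROW_COUNTS[order[0]]
    cost = 0.0
    for table in order[1:]:
        rows = rows * ROW_COUNTS[table] * join_selectivity(joined, table)
        cost += rows
        joined.append(table)
    return cost

def choose_plan(tables):
    """Steps 1 and 3: enumerate join orders, then pick the cheapest one."""
    candidates = list(permutations(tables))                    # step 1
    costs = {order: estimate_cost(order) for order in candidates}
    best = min(costs, key=costs.get)                           # step 3
    return best, costs

best, costs = choose_plan(["orders", "customers", "regions"])
for order, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(" JOIN ".join(order), "-> estimated cost", round(cost))
print("chosen:", " JOIN ".join(best))
```

Under these made-up statistics, the orders that join the two small tables first come out cheapest, and the orders that start with a cross product of orders and regions come out most expensive. Every plan returns the same rows; only the estimated amount of work differs.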
