<input id="ohw05"></input>
  • <table id="ohw05"><menu id="ohw05"></menu></table>
  • <var id="ohw05"></var>
  • <code id="ohw05"><cite id="ohw05"></cite></code>
    <label id="ohw05"></label>
    <var id="ohw05"></var>
  • openGauss內核:SQL解析過程分析

    摘要:在傳統數據庫中SQL引擎一般指對用戶輸入的SQL語句進行解析、優化的軟件模塊。SQL的解析過程主要分為:詞法、語法和語義分析。

    本文分享自華為云社區《 openGauss內核分析(三):SQL解析》,作者:Gauss松鼠會。

    在傳統數據庫中SQL引擎一般指對用戶輸入的SQL語句進行解析、優化的軟件模塊。

    SQL的解析過程主要分為:

    ? 詞法分析:將用戶輸入的SQL語句拆解成單詞(Token)序列,并識別出關鍵字、標識、常量等。

    ? 語法分析:分析器對詞法分析器解析出來的單詞(Token)序列在語法上是否滿足SQL語法規則。

    ? 語義分析:語義分析是SQL解析過程的一個邏輯階段,主要任務是在語法正確的基礎上進行上下文有關性質的審查,在SQL解析過程中該階段完成表名、操作符、類型等元素的合法性判斷,同時檢測語義上的二義性。

    openGauss在pg_parse_query中調用raw_parser函數對用戶輸入的SQL命令進行詞法分析和語法分析,生成語法樹添加到鏈表parsetree_list中。完成語法分析后,對于parsetree_list中的每一顆語法樹parsetree,會調用parse_**yze函數進行語義分析,根據SQL命令的不同,執行對應的入口函數,最終生成查詢樹。

    詞法分析

    openGauss使用flex工具進行詞法分析。flex工具通過對已經定義好的詞法文件進行編譯,生成詞法分析的代碼。詞法文件是scan.l,它根據SQL語言標準對SQL語言中的關鍵字、標識符、操作符、常量、終結符進行了定義和識別。在kwlist.h中定義了大量的關鍵字,按照字母的順序排列,方便在查找關鍵字時通過二分法進行查找。 在scan.l中處理“標識符”時,會到關鍵字列表中進行匹配,如果一個標識符匹配到關鍵字,則認為是關鍵字,否則才是標識符,即關鍵字優先. 以“select a, b from item”為例說明詞法分析結果。

    語法分析

    openGauss中定義了bison工具能夠識別的語法文件gram.y,根據SQL語言的不同定義了一系列表達Statement的結構體(這些結構體通常以Stmt作為命名后綴),用來保存語法分析結果。以SELECT查詢為例,它對應的Statement結構體如下。

    typedef struct SelectStmt
    {
        NodeTag        type;
        List       *distinctClause; /* NULL, list of DISTINCT ON exprs, or
                                     * lcons(NIL,NIL) for all (SELECT DISTINCT) */
        IntoClause *intoClause;        /* target for SELECT INTO */
        List       *targetList;        /* the target list (of ResTarget) */
        List       *fromClause;        /* the FROM clause */
        Node       *whereClause;    /* WHERE qualification */
        List       *groupClause;    /* GROUP BY clauses */
        Node       *havingClause;    /* HAVING conditional-expression */
        List       *windowClause;    /* WINDOW window_name AS (...), ... */
        WithClause *withClause;        /* WITH clause */
        List       *valuesLists;    /* untransformed list of expression lists */
        List       *sortClause;        /* sort clause (a list of SortBy's) */
        Node       *limitOffset;    /* # of result tuples to skip */
        Node       *limitCount;        /* # of result tuples to return */
        ……
    } SelectStmt;

    這個結構體可以看作一個多叉樹,每個葉子節點都表達了SELECT查詢語句中的一個語法結構,對應到gram.y中,它會有一個SelectStmt。代碼如下:

    從simple_select語法分析結構可以看出,一條簡單的查詢語句由以下子句組成:去除行重復的distinctClause、目標屬性targetList、SELECT INTO子句intoClause、FROM子句fromClause、WHERE子句whereClause、GROUP BY子句groupClause、HAVING子句havingClause、窗口子句windowClause和plan_hint子句。在成功匹配simple_select語法結構后,將會創建一個Statement結構體,將各個子句進行相應的賦值。對simple_select而言,目標屬性、FROM子句、WHERE子句是最重要的組成部分。SelectStmt與其他結構體的關系如下:

    下面以“select a, b from item”為例說明簡單select語句的解析過程,函數exec_simple_query調用pg_parse_query執行解析,解析樹中只有一個元素。

    (gdb) p *parsetree_list
    $47 = {type = T_List, length = 1, head = 0x7f5ff986c8f0, tail = 0x7f5ff986c8f0}

    List中的節點類型為T_SelectStmt。

    (gdb) p *(Node *)(parsetree_list->head.data->ptr_value)
    $45 = {type = T_SelectStmt}

    查看SelectStmt結構體,targetList 和fromClause非空。

    (gdb) set $stmt = (SelectStmt *)(parsetree_list->head.data->ptr_value)
    (gdb) p *$stmt
    $50 = {type = T_SelectStmt, distinctClause = 0x0, intoClause = 0x0, targetList = 0x7f5ffa43d588, fromClause = 0x7f5ff986c888, startWithClause = 0x0, whereClause = 0x0, groupClause = 0x0,
      havingClause = 0x0, windowClause = 0x0, withClause = 0x0, valuesLists = 0x0, sortClause = 0x0, limitOffset = 0x0, limitCount = 0x0, lockingClause = 0x0, hintState = 0x0, op = SETOP_NONE, all = false,
      larg = 0x0, rarg = 0x0, hasPlus = false}

    查看SelectStmt的targetlist,有兩個ResTarget。

    (gdb) p *($stmt->targetList)
    $55 = {type = T_List, length = 2, head = 0x7f5ffa43d540, tail = 0x7f5ffa43d800}
    (gdb) p *(Node *)($stmt->targetList->head.data->ptr_value)
    $57 = {type = T_ResTarget}
    
    (gdb) set $restarget1=(ResTarget *)($stmt->targetList->head.data->ptr_value)
    (gdb) p *$restarget1
    $60 = {type = T_ResTarget, name = 0x0, indirection = 0x0, val = 0x7f5ffa43d378, location = 7}
    (gdb) p *$restarget1->val
    $63 = {type = T_ColumnRef}
    (gdb) p *(ColumnRef *)$restarget1->val
    $64 = {type = T_ColumnRef, fields = 0x7f5ffa43d470, prior = false, indnum = 0, location = 7}
    (gdb) p *((ColumnRef *)$restarget1->val)->fields
    $66 = {type = T_List, length = 1, head = 0x7f5ffa43d428, tail = 0x7f5ffa43d428}
    (gdb) p *(Node *)(((ColumnRef *)$restarget1->val)->fields)->head.data->ptr_value
    $67 = {type = T_String}
    (gdb) p *(Value *)(((ColumnRef *)$restarget1->val)->fields)->head.data->ptr_value
    $77 = {type = T_String, val = {ival = 140050197369648, str = 0x7f5ffa43d330 "a"}}
    (gdb) set $restarget2=(ResTarget *)($stmt->targetList->tail.data->ptr_value)
    (gdb) p *$restarget2
    $89 = {type = T_ResTarget, name = 0x0, indirection = 0x0, val = 0x7f5ffa43d638, location = 10}
    (gdb) p *$restarget2->val
    $90 = {type = T_ColumnRef}
    (gdb) p *(ColumnRef *)$restarget2->val
    $91 = {type = T_ColumnRef, fields = 0x7f5ffa43d730, prior = false, indnum = 0, location = 10}
    (gdb) p *((ColumnRef *)$restarget2->val)->fields
    $92 = {type = T_List, length = 1, head = 0x7f5ffa43d6e8, tail = 0x7f5ffa43d6e8}
    (gdb) p *(Node *)(((ColumnRef *)$restarget2->val)->fields)->head.data->ptr_value
    $93 = {type = T_String}
    (gdb) p *(Value *)(((ColumnRef *)$restarget2->val)->fields)->head.data->ptr_value
    $94 = {type = T_String, val = {ival = 140050197370352, str = 0x7f5ffa43d5f0 "b"}}

    查看SelectStmt的fromClause,有一個RangeVar。

    (gdb) p *$stmt->fromClause
    $102 = {type = T_List, length = 1, head = 0x7f5ffa43dfe0, tail = 0x7f5ffa43dfe0}
    (gdb) set $fromclause=(RangeVar*)($stmt->fromClause->head.data->ptr_value)
    (gdb) p *$fromclause
    $103 = {type = T_RangeVar, catalogname = 0x0, schemaname = 0x0, relname = 0x7f5ffa43d848 "item", partitionname = 0x0, subpartitionname = 0x0, inhOpt = INH_DEFAULT, relpersistence = 112 'p', alias = 0x0,
      location = 17, ispartition = false, issubpartition = false, partitionKeyValuesList = 0x0, isbucket = false, buckets = 0x0, length = 0, foreignOid = 0, withVerExpr = false}

    綜合以上分析可以得到語法樹結構。

    語義分析

    在完成詞法分析和語法分析后,parse_Ana lyze函數會根據語法樹的類型,調用transformSelectStmt將parseTree改寫為查詢樹。

    (gdb) p *result
    $3 = {type = T_Query, commandType = CMD_SELECT, querySource = QSRC_ORIGINAL, queryId = 0, canSetTag = false, utilityStmt = 0x0, resultRelation = 0, hasAggs = false, hasWindowFuncs = false,
      hasSubLinks = false, hasDistinctOn = false, hasRecursive = false, hasModifyingCTE = false, hasForUpdate = false, hasRowSecurity = false, hasSynonyms = false, cteList = 0x0, rtable = 0x7f5ff5eb8c88,
      jointree = 0x7f5ff5eb9310, targetList = 0x7f5ff5eb9110,…}
    
    (gdb) p *result->targetList
    $13 = {type = T_List, length = 2, head = 0x7f5ff5eb90c8, tail = 0x7f5ff5eb92c8}
    
    (gdb) p *(Node *)(result->targetList->head.data->ptr_value)
    $8 = {type = T_TargetEntry}
    (gdb) p *(TargetEntry*)(result->targetList->head.data->ptr_value)
    $9 = {xpr = {type = T_TargetEntry, selec = 0}, expr = 0x7f5ff636ff48, resno = 1, resname = 0x7f5ff5caf330 "a", ressortgroupref = 0, resorigtbl = 24576, resorigcol = 1, resjunk = false}
    (gdb) p *(TargetEntry*)(result->targetList->tail.data->ptr_value)
    $10 = {xpr = {type = T_TargetEntry, selec = 0}, expr = 0x7f5ff5eb9178, resno = 2, resname = 0x7f5ff5caf5f0 "b", ressortgroupref = 0, resorigtbl = 24576, resorigcol = 2, resjunk = false}
    (gdb)
    (gdb) p *result->rtable
    $14 = {type = T_List, length = 1, head = 0x7f5ff5eb8c40, tail = 0x7f5ff5eb8c40}
    (gdb)  p *(Node *)(result->rtable->head.data->ptr_value)
    $15 = {type = T_RangeTblEntry}
    (gdb) p *(RangeTblEntry*)(result->rtable->head.data->ptr_value)
    $16 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relname = 0x7f5ff636efb0 "item", partAttrNum = 0x0, relid = 24576, partitionOid = 0, isContainPartition = false, subpartitionOid = 0……}

    得到的查詢樹結構如下:

    完成詞法、語法和語義分析后,SQL解析過程完成,SQL引擎開始執行查詢優化,在下一期中再具體分析。

     

    點擊關注,第一時間了解華為云新鮮技術~

    posted @ 2022-06-28 14:09  華為云開發者聯盟  閱讀(35)  評論(0編輯  收藏  舉報
    国产美女a做受大片观看