<input id="ohw05"></input>
  • <table id="ohw05"><menu id="ohw05"></menu></table>
  • <var id="ohw05"></var>
  • <code id="ohw05"><cite id="ohw05"></cite></code>
    <label id="ohw05"></label>
    <var id="ohw05"></var>
  • 簡單sql字段解析器實現參考

      用例:有一段sql語句,我們需要從中截取出所有字段部分,以便進行后續的類型推斷或者別名字段抽取定義,請給出此解析方法。

      想來很簡單吧,因為 sql 中的字段列表,使用方式有限,比如 a as b, a, a b...

     

    1. 解題思路

      如果不想做復雜處理,最容易想到的,就是直接用某個特征做分割即可。比如,先截取出 字段列表部分,然后再用逗號',' 分割,就可以得到一個個的字段了。然后再要細分,其實只需要用 as 進行分割就可以了。

      看起來好像可行,但是存在許多漏洞,首先,這里面有太多的假設:各種截取部分要求必須符合要求,必須沒有多余的逗號,必須要有as 等等。這明顯不符合要求了。

      其二,我們可以換一種轉換方式。比如先截取到field部分,然后先以 as 分割,再以逗號分割,然后取最后一個詞作為field。 

      看起來好像更差了,截取到哪里已經完全不知道了。即原文已經被破壞殆盡,而且同樣要求要有 as 轉換標簽,而且對于函數覬覦有 as 的場景,就完全錯誤了。

      其三,最好還是自行一個個單詞地解析,field 字段無外乎幾種情況,1. 普通字段如 select a; 2. 帶as的普通字段如 select a as b; 3. 帶函數的字段如 select coalesce(a, b); 4. 帶函數且帶as的字段如 select coalesce(a, b) ab; 5. 函數內帶as的字段如 select cast(a as string) b; ...   我們只需依次枚舉對應的情況,就可以將字段解析出來了。

      看起來是個不錯的想法。但是具體實現如何?

     

    2. 具體解析實現

      主要分兩個部分,1. 需要定義一個解析后的結果數據結構,以便清晰描述字段信息; 2. 分詞解析sql并以結構體返回;

      我們先來看看整個算法核心:

    /**
     * 功能描述: 簡單sql字段解析器
     *
     *        樣例如1:
     *          select COALESCE(t1.xno, t2.xno, t3.xno) as xno,
     *             case when t1.no is not null then 1 else null end as xxk001,
     *             case when t2.no is not null then 1 else null end as xxk200,
     *             case when t3.xno is not null then 1 else null end as xx3200
     *             from xxk001 t1
     *               full join xxkj100 t2 on t1.xno = t2.xno
     *               full join xxkj200 t3 on t1.xno = t3.xno;
     *
     *        樣例如2:
     *          select cast(a as string) as b from ccc;
     *
     *        樣例如3:
     *          with a as(select cus,x1 from b1), b as (select cus,x2 from b2)
     *              select a.cus as a_cus from a join b on a.cus=b.cus where xxx;
     *
     *        樣例如4:
     *         select a.xno,b.xx from a_tb as a join b_tb as b on a.id = b.id
     *
     *        樣例如5:
     *          select cast  \t(a as string) a_str, cc (a as double) a_double from x
     *
     */
    public class SimpleSqlFieldParser {
    
        /**
         * 解析一段次標簽sql 中的字段列表
         *
         * @param sql 原始sql, 需如 select xx from xxx join ... 格式
         * @return 字段列表
         */
        public static List<SelectFieldClauseDescriptor> parse(String sql) {
            String columnPart = adaptFieldPartSql(sql);
            int deep = 0;
            List<StringBuilder> fieldTokenSwap = new ArrayList<>();
            StringBuilder currentTokenBuilder = new StringBuilder();
            List<SelectFieldClauseDescriptor> fieldList = new ArrayList<>();
            fieldTokenSwap.add(currentTokenBuilder);
            int len = columnPart.length();
            char[] columnPartChars = columnPart.toCharArray();
            for(int i = 0; i < len; i++) {
                // 空格忽略,換行忽略,tab忽略
                // 字符串相接
                // 左(號入棧,++deep;
                // 右)號出棧,--deep;
                // deep>0 忽略所有其他直接拼接
                // as 則取下一個值為fieldName
                // case 則直接取到end為止;
                //,號則重置token,構建結果集
                char currentChar = columnPartChars[i];
                switch (currentChar) {
                    case '(':
                        ++deep;
                        currentTokenBuilder.append(currentChar);
                        break;
                    case ')':
                        --deep;
                        currentTokenBuilder.append(currentChar);
                        break;
                    case ',':
                        if(deep == 0) {
                            addNewField(fieldList, fieldTokenSwap, true);
                            fieldTokenSwap = new ArrayList<>();
                            currentTokenBuilder = new StringBuilder();
                            fieldTokenSwap.add(currentTokenBuilder);
                            break;
                        }
                        currentTokenBuilder.append(currentChar);
                        break;
                    case ' ':
                    case '\t':
                    case '\r':
                    case '\n':
                        if(deep > 0) {
                            currentTokenBuilder.append(currentChar);
                            continue;
                        }
                        if(currentTokenBuilder.length() == 0) {
                            continue;
                        }
                        // original_name as   --> alias
                        if(i + 1 < len) {
                            int j = i + 1;
                            // 收集連續的空格
                            StringBuilder spaceHolder = new StringBuilder();
                            boolean isNextLeftBracket = false;
                            do {
                                char nextChar = columnPart.charAt(j++);
                                if(nextChar == ' ' || nextChar == '\t'
                                        || nextChar == '\r' || nextChar == '\n') {
                                    spaceHolder.append(nextChar);
                                    continue;
                                }
                                if(nextChar == '(') {
                                    isNextLeftBracket = true;
                                }
                                break;
                            } while (j < len);
                            if(isNextLeftBracket) {
                                currentTokenBuilder.append(currentChar);
                            }
                            if(spaceHolder.length() > 0) {
                                currentTokenBuilder.append(spaceHolder);
                                i += spaceHolder.length();
                            }
                            if(isNextLeftBracket) {
                                // continue next for, function begin
                                continue;
                            }
                        }
                        if(fieldTokenSwap.size() == 1) {
                            if(fieldTokenSwap.get(0).toString().equalsIgnoreCase("case")) {
                                String caseWhenPart = CommonUtil.readSplitWord(
                                        columnPartChars, i, " ", "end");
                                currentTokenBuilder.append(caseWhenPart);
                                if(caseWhenPart.length() <= 0) {
                                    throw new BizException("語法錯誤,未找到case..when的結束符");
                                }
                                i += caseWhenPart.length();
                            }
                        }
                        addNewField(fieldList, fieldTokenSwap, false);
                        currentTokenBuilder = new StringBuilder();
                        fieldTokenSwap.add(currentTokenBuilder);
                        break;
                        // 空格忽略
                    default:
                        currentTokenBuilder.append(currentChar);
                        break;
                }
    
            }
            // 處理剩余尚未存儲的字段信息
            addNewField(fieldList, fieldTokenSwap, true);
            return fieldList;
        }
    
        /**
         * 新增一個字段描述
         *
         * @param fieldList 字段容器
         * @param fieldTokenSwap 候選詞
         */
        private static void addNewField(List<SelectFieldClauseDescriptor> fieldList,
                                        List<StringBuilder> fieldTokenSwap,
                                        boolean forceAdd) {
            int ts = fieldTokenSwap.size();
            if(ts == 1 && forceAdd) {
                // db.original_name,
                String fieldName = fieldTokenSwap.get(0).toString();
                String alias = fieldName;
                if(fieldName.contains(".")) {
                    alias = fieldName.substring(fieldName.lastIndexOf('.') + 1);
                }
                fieldList.add(new SelectFieldClauseDescriptor(fieldName, alias));
                return;
            }
            if(ts < 2) {
                return;
            }
            if(ts == 2) {
                // original_name alias,
                if(fieldTokenSwap.get(1).toString().equalsIgnoreCase("as")) {
                    return;
                }
                fieldList.add(new SelectFieldClauseDescriptor(
                        fieldTokenSwap.get(0).toString(),
                        fieldTokenSwap.get(1).toString()));
            }
            else if(ts == 3) {
                // original_name as alias,
                fieldList.add(new SelectFieldClauseDescriptor(
                        fieldTokenSwap.get(0).toString(),
                        fieldTokenSwap.get(2).toString()));
            }
            else {
                throw new BizException("字段語法解析錯誤,超過3個以字段描述信息:" + ts);
            }
        }
    
        // 截取適配 field 字段信息部分
        private static String adaptFieldPartSql(String fullSql) {
            int start = fullSql.lastIndexOf("select ");
            int end = fullSql.lastIndexOf(" from");
            String columnPart = fullSql.substring(start + "select ".length(), end);
            return columnPart.trim();
        }
    
    }

      應該說是比較簡單的,一個for, 一個 switch ,就搞定了。其他的,更多的是邏輯判定。

      下面我們來看看字段描述類的寫法,其實就是兩個字段,源字段和別名。

    /**
     * 功能描述: sql字段描述 select 字段描述類
     *
     */
    public class SelectFieldClauseDescriptor {
        private String fieldName;
        private String alias;
    
        public SelectFieldClauseDescriptor(String fieldName, String alias) {
            this.fieldName = fieldName;
            this.alias = alias;
        }
    
        public String getFieldName() {
            return fieldName;
        }
    
        public String getAlias() {
            return alias;
        }
    
    
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            SelectFieldClauseDescriptor that = (SelectFieldClauseDescriptor) o;
            return Objects.equals(fieldName, that.fieldName) &&
                    Objects.equals(alias, that.alias);
        }
    
        @Override
        public int hashCode() {
            return Objects.hash(fieldName, alias);
        }
    
        @Override
        public String toString() {
            return "SelectFieldClauseDescriptor{" +
                    "fieldName='" + fieldName + '\'' +
                    ", alias='" + alias + '\'' +
                    '}';
        }
    }

      它存在的意義,僅僅是為了使用方更方便取值,以為更進一步的解析提供了依據。

     

    3. 單元測試

      其實像寫這種工具類,單元測試最是方便簡單。因為最初的結果,我們早已預料,以測試驅動開發最合適不過了。而且,基本上一出現不符合預期的值時,很快速就定位問題了。

    /**
     * 功能描述: sql字段解析器測試
     **/
    public class SimpleSqlFieldParserTest {
    
        @Test
        public void testParse() {
            String sql;
            List<SelectFieldClauseDescriptor> parsedFieldList;
            sql = "select COALESCE(t1.xno, t2.xno, t3.xno) as xno,\n" +
                    "   case when t1.xno is not null then 1 else null end as xxk001,\n" +
                    "   case when t2.xno is not null then 1 else null end as xxk200,\n" +
                    "   case when t3.xno is not null then 1 else null end as xx3200\n" +
                    "   from xxk001 t1\n" +
                    "     full join xxkj100 t2 on t1.xno = t2.xno\n" +
                    "     full join xxkj200 t3 on t1.xno = t3.xno;";
            parsedFieldList = SimpleSqlFieldParser.parse(sql);
            System.out.println("result:");
            parsedFieldList.forEach(System.out::println);
            Assert.assertEquals("字段個數解析不正確",
                    4, parsedFieldList.size());
            Assert.assertEquals("字段別名解析不正確",
                    "xno", parsedFieldList.get(0).getAlias());
            Assert.assertEquals("字段別名解析不正確",
                    "xx3200", parsedFieldList.get(3).getAlias());
    
            sql = "select cast(a as string) as b from ccc;";
            parsedFieldList = SimpleSqlFieldParser.parse(sql);
            System.out.println("result:");
            parsedFieldList.forEach(System.out::println);
            Assert.assertEquals("字段個數解析不正確",
                    1, parsedFieldList.size());
            Assert.assertEquals("字段別名解析不正確",
                    "b", parsedFieldList.get(0).getAlias());
    
            sql = "with a as(select cus,x1 from b1), b as (select cus,x2 from b2)\n" +
                    "    select a.cus as a_cus, cast(a \nas string) as a_cus2, " +
                    "b.x2 b2 from a join b on a.cus=b.cus where xxx;";
            parsedFieldList = SimpleSqlFieldParser.parse(sql);
            System.out.println("result:");
            parsedFieldList.forEach(System.out::println);
            Assert.assertEquals("字段個數解析不正確",
                    3, parsedFieldList.size());
            Assert.assertEquals("字段別名解析不正確",
                    "a_cus", parsedFieldList.get(0).getAlias());
            Assert.assertEquals("字段別名解析不正確",
                    "b2", parsedFieldList.get(2).getAlias());
    
            sql = "select a.xno,b.xx,qqq from a_tb as a join b_tb as b on a.id = b.id";
            parsedFieldList = SimpleSqlFieldParser.parse(sql);
            System.out.println("result:");
            parsedFieldList.forEach(System.out::println);
            Assert.assertEquals("字段個數解析不正確",
                    3, parsedFieldList.size());
            Assert.assertEquals("字段別名解析不正確",
                    "xno", parsedFieldList.get(0).getAlias());
            Assert.assertEquals("字段別名解析不正確",
                    "qqq", parsedFieldList.get(2).getAlias());
    
            sql = "select cast (a.a_int as string) a_str, b.xx, coalesce  \n( a, b, c) qqq from a_tb as a join b_tb as b on a.id = b.id";
            parsedFieldList = SimpleSqlFieldParser.parse(sql);
            System.out.println("result:");
            parsedFieldList.forEach(System.out::println);
            Assert.assertEquals("字段個數解析不正確",
                    3, parsedFieldList.size());
            Assert.assertEquals("字段別名解析不正確",
                    "a_str", parsedFieldList.get(0).getAlias());
            Assert.assertEquals("字段原始名解析不正確",
                    "cast (a.a_int as string)", parsedFieldList.get(0).getFieldName());
            Assert.assertEquals("字段別名解析不正確",
                    "qqq", parsedFieldList.get(2).getAlias());
            Assert.assertEquals("字段原始名解析不正確",
                    "coalesce  \n( a, b, c)", parsedFieldList.get(2).getFieldName());
        }
    }

      至此,一個簡單的字段解析器完成。小工具,供參考!

     

    posted @ 2021-06-21 09:58  等你歸去來  閱讀(965)  評論(4編輯  收藏  舉報
    国产美女a做受大片观看