PostgreSQL Source Code Walkthrough (168) - Query #88 (Lexical Definitions in PG: scan.l) #1

Original PostgreSQL Author: husthxd Date: 2019-04-15 15:23:06

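Given a SQL statement as input, how does PostgreSQL parse it and identify the statement type, the base tables, the columns, and so on? The next few installments work through this step by step.
This installment covers PostgreSQL's lexical definition file (the Flex input file), src/backend/parser/scan.l.
As described earlier, a Flex input file consists of four parts: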


%{
Declarations
%}
Definitions
%%
Rules
%%
User subroutines

This installment covers the first part, Declarations.

1. Declarations

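The part enclosed by %{ and %} is the Declarations part. It is plain C code that Flex copies verbatim into the generated scanner source (lex.yy.c by default).
Among the more important definitions:
YYSTYPE: Bison uses a C union to hold every possible type of semantic value. In a default (non-reentrant) parser, the global variable yylval has type YYSTYPE.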


%top{
/*-------------------------------------------------------------------------
 *
 * scan.l
 *    lexical scanner for PostgreSQL
 *
 * NOTE NOTE NOTE:
 * The rules in this file must be kept in sync with src/fe_utils/psqlscan.l!
 *
 * The rules are designed so that the scanner never has to backtrack,
 * in the sense that there is always a rule that can match the input
 * consumed so far (the rule action may internally throw back some input
 * with yyless(), however).  As explained in the flex manual, this makes
 * for a useful speed increase --- about a third faster than a plain -CF
 * lexer, in simple testing.  The extra complexity is mostly in the rules
 * for handling float numbers and continued string literals.  If you change
 * the lexical rules, verify that you haven't broken the no-backtrack
 * property by running flex with the "-b" option and checking that the
 * resulting "lex.backup" file says that no backing up is needed.  (As of
 * Postgres 9.2, this check is made automatically by the Makefile.)
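 * Annotation: the rules are written so that the scanner never has to back
 * up: whatever input has been consumed so far, some rule matches it
 * (a rule action may still push input back internally with yyless()).
 * As the Flex manual explains, this improves performance: in simple tests,
 *   about one third faster than a plain -CF scanner.
 * The extra complexity sits mainly in the rules for floating-point numbers
 * and continued string literals.
 * After changing the lexical rules, run flex with the -b option and check
 *   that the resulting "lex.backup" file reports that no backing up is needed.
 * (As of PG 9.2 the Makefile performs this check automatically.)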
 *
 * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * IDENTIFICATION
 *    src/backend/parser/scan.l
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"
#include <ctype.h>
#include <unistd.h>
#include "common/string.h"
#include "parser/gramparse.h"
#include "parser/parser.h"      /* only needed for GUC variables */
#include "parser/scansup.h"
#include "mb/pg_wchar.h"
}
//------------------ Declarations section
%{
/* LCOV_EXCL_START */
/* Avoid exit() on fatal scanner errors (a bit ugly -- see yy_fatal_error) */
// Annotation: on a fatal scanner error, avoid a direct exit(); fprintf is redirected to ereport(ERROR) instead
#undef fprintf
#define fprintf(file, fmt, msg)  fprintf_to_ereport(fmt, msg)
static void
fprintf_to_ereport(const char *fmt, const char *msg)
{
    ereport(ERROR, (errmsg_internal("%s", msg)));
}
/*
 * GUC variables.  This is a DIRECT violation of the warning given at the
 * head of gram.y, ie flex/bison code must not depend on any GUC variables;
 * as such, changing their values can induce very unintuitive behavior.
 * But we shall have to live with it until we can remove these variables.
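 * Annotation: these are GUC variables.  They directly violate the warning at
 * the head of gram.y, i.e. that flex/bison code must not depend on GUC
 * variables; changing their values can therefore cause very unintuitive behavior.
 * The code has to live with this until the variables can be removed.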
 */
int         backslash_quote = BACKSLASH_QUOTE_SAFE_ENCODING;
bool        escape_string_warning = true;
bool        standard_conforming_strings = true;
/*
 * Set the type of YYSTYPE.
 * Annotation: in Bison, yylval has type YYSTYPE, which defaults to int.
 * Internally, bison declares each value as a C union that includes all of the types. 
 * You list all of the types in %union declarations. 
 * Bison turns them into a typedef for a union type called YYSTYPE.
 */
#define YYSTYPE core_YYSTYPE
/*
 * Set the type of yyextra.  All state variables used by the scanner should
 * be in yyextra, *not* statically allocated.
 * Annotation: all state variables used by the scanner belong in yyextra,
 * not in statically allocated storage.
 */
#define YY_EXTRA_TYPE core_yy_extra_type *
/*
 * Each call to yylex must set yylloc to the location of the found token
 * (expressed as a byte offset from the start of the input text).
 * When we parse a token that requires multiple lexer rules to process,
 * this should be done in the first such rule, else yylloc will point
 * into the middle of the token.
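 * Annotation: every call to yylex must set yylloc to the position of the
 * token it found, as a byte offset from the start of the input text.
 * For a token that takes several lexer rules to process, set it in the
 *   first of those rules; otherwise yylloc points into the middle of the token.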
 */
#define SET_YYLLOC()  (*(yylloc) = yytext - yyextra->scanbuf)
/*
 * Advance yylloc by the given number of bytes.
 * Annotation: yylloc is a byte offset, so this simply adds delta to it.
 */
#define ADVANCE_YYLLOC(delta)  ( *(yylloc) += (delta) )
#define startlit()  ( yyextra->literallen = 0 )
static void addlit(char *ytext, int yleng, core_yyscan_t yyscanner);
static void addlitchar(unsigned char ychar, core_yyscan_t yyscanner);
static char *litbufdup(core_yyscan_t yyscanner);
static char *litbuf_udeescape(unsigned char escape, core_yyscan_t yyscanner);
static unsigned char unescape_single_char(unsigned char c, core_yyscan_t yyscanner);
static int  process_integer_literal(const char *token, YYSTYPE *lval);
static bool is_utf16_surrogate_first(pg_wchar c);
static bool is_utf16_surrogate_second(pg_wchar c);
static pg_wchar surrogate_pair_to_codepoint(pg_wchar first, pg_wchar second);
static void addunicode(pg_wchar c, yyscan_t yyscanner);
static bool check_uescapechar(unsigned char escape);
#define yyerror(msg)  scanner_yyerror(msg, yyscanner)
#define lexer_errposition()  scanner_errposition(*(yylloc), yyscanner)
static void check_string_escape_warning(unsigned char ychar, core_yyscan_t yyscanner);
static void check_escape_warning(core_yyscan_t yyscanner);
/*
 * Work around a bug in flex 2.5.35: it emits a couple of functions that
 * it forgets to emit declarations for.  Since we use -Wmissing-prototypes,
 * this would cause warnings.  Providing our own declarations should be
 * harmless even when the bug gets fixed.
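 * Annotation: flex 2.5.35 has a bug: it emits a couple of functions without
 * emitting declarations for them.
 * Because -Wmissing-prototypes is in use, this would produce warnings.
 * Supplying our own declarations should stay harmless even after the bug is fixed.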
 */
extern int  core_yyget_column(yyscan_t yyscanner);
extern void core_yyset_column(int column_no, yyscan_t yyscanner);
%}

2. References

Flex&Bison

Source: ITPUB Blog, link: http://blog.itpub.net/6906/viewspace-2641424/. If you repost this article, please credit the source; otherwise legal action may be taken.
