ITPub博客

首页 > 数据库 > Oracle > oracle全文索引之STOPLIST_ CTXCAT 索引_INDEX SET

oracle全文索引之STOPLIST_ CTXCAT 索引_INDEX SET

原创 Oracle 作者:jolly10 时间:2009-05-27 10:26:47 0 删除 编辑
这篇文章介绍的是ORACLE CTXCAT索引的INDEX SET,也是CTXCAT索引特有的属性。[@more@]
CTXCAT 索引是CONTEXT索引的简化版,CTXCAT索引支持的PREFERENCE包括:LEXER、STOPLIST、WORDLIST和STORAGE 参数。不支持其他的参数如:DATASTORE、FILTER、SECTION GROUP。虽然支持LEXER但不支持THEME查询,而且不支持FORMAT、CHARSET和LANGUAGE列,另外不支持表和索引分区。

CTXCAT 索引仅仅包含了CONTEXT索引的部分内容,但是CTXCAT索引有其自身的优点。其中最突出的优点就是支持DML同步。CONTEXT索引由于结构过于复杂,且索引的数据量一般较大,因此CONTEXT索引并不是自动同步的。而CTXCAT索引是自动同步的,当发生了DML修改时,Oracle会自动同步CTXCAT索引,降低了索引的维护成本。

CTXCAT 索引的另外一个优点就是这里要介绍的INDEX SET属性,这也是CTXCAT索引特有的属性。简单的说,CTXCAT可以建立一个索引集。可以把一些经常与CTXCAT查询组合使用的查询列的索引添加到索引集中。比如,如果在查询文章内容的同时,经常需要查询文章的作者、标题或创建时间等信息,则可以将这些信息列的索引添加到索引集中,Oracle 可以将这些查询封装到CATSEARCH操作中,从而提高全文索引的效率。

下面看一个简单的例子:

[oracle@rhel140 ~]$ sqlplus myuser/myuser

SQL*Plus: Release 10.2.0.1.0 - Production on Sun May 24 23:19:47 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> CREATE TABLE T (ID NUMBER, TITLE VARCHAR2(256), CREATED DATE, DOCS VARCHAR2(4000));

Table created.

SQL> INSERT INTO T VALUES (1, 'ORACLE TEXT REFERENCE', TO_DATE('200203', 'YYYYMM'),
2 'This manual provides reference information for Oracle Text. Use it as a reference for
3 creating Oracle Text indexes, for issuing Oracle Text queries, for presenting
4 documents, and for using the Oracle Text PL/SQL packages.');

1 row created.

SQL> INSERT INTO T VALUES (2, 'ORACLE TEXT APPLICATION DEVELOPER''S GUIDE',
2 TO_DATE('200203', 'YYYYMM'),
3 'This guide explains how to build query applications with Oracle Text. This preface
4 contains these topics:');

1 row created.

SQL> INSERT INTO T VALUES (3, 'ORACLE SQL REFERENCE', TO_DATE('200210', 'YYYYMM'),
2 'This reference contains a complete description of the Structured Query Language
3 (SQL) used to manage information in an Oracle database. Oracle SQL is a superset
4 of the American National Standards Institute (ANSI) and the International
5 Standards Organization (ISO) SQL99 standard.');

1 row created.

SQL> COMMIT;

Commit complete.

SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT;

Index created.

SQL> SET AUTOT ON EXP
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0;

ID
----------
1
3


Execution Plan
----------------------------------------------------------
Plan hash value: 3282600506

------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2027 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 2027 | 2 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | IND_T_DOCS | | | | |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL',NULL)>0)

Note
-----
- dynamic sampling used for this statement


SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0
2 AND CREATED = TO_DATE('200203', 'YYYYMM');

ID
----------
1


Execution Plan
----------------------------------------------------------
Plan hash value: 3282600506

------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2036 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 2036 | 2 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | IND_T_DOCS | | | | |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CREATED"=TO_DATE('2002-03-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
2 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL',NULL)>0)

Note
-----
- dynamic sampling used for this statement


SQL> CREATE INDEX IND_T_CREATED ON T (CREATED);

Index created.

SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0
2 AND CREATED = TO_DATE('200203', 'YYYYMM');

ID
----------
1


Execution Plan
----------------------------------------------------------
Plan hash value: 2040116302

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2036 | 4 (25)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | T | 1 | 2036 | 4 (25)| 00:00:01 |
| 2 | BITMAP CONVERSION TO ROWIDS | | | | | |
| 3 | BITMAP AND | | | | | |
| 4 | BITMAP CONVERSION FROM ROWIDS| | | | | |
|* 5 | INDEX RANGE SCAN | IND_T_CREATED | 2 | | 1 (0)| 00:00:01 |
| 6 | BITMAP CONVERSION FROM ROWIDS| | | | | |
| 7 | SORT ORDER BY | | | | | |
|* 8 | DOMAIN INDEX | IND_T_DOCS | 2 | | | |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

5 - access("CREATED"=TO_DATE('2002-03-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
8 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL',NULL)>0)

Note
-----
- dynamic sampling used for this statement


上面的查询不仅包括了对DOCS列的全文索引查询还包括了对CREATED列的查询。Oracle采用了先将BTREE索引转化为BITMAP索引,在进行BITMAP索引的AND的方法进行了处理。而如果采用了INDEX SET的方法,则不需要如此复杂的转化:

SQL> DROP INDEX IND_T_DOCS;

Index dropped.

SQL> DROP INDEX IND_T_CREATED;

Index dropped.


SQL> BEGIN
2 CTX_DDL.CREATE_INDEX_SET('TEST_INDEXSET');
3 CTX_DDL.ADD_INDEX('TEST_INDEXSET', 'CREATED');
4 END;
5 /

PL/SQL procedure successfully completed.

SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
2 PARAMETERS ('INDEX SET TEST_INDEXSET');

Index created.

SQL> SET AUTOT ON EXP
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'CREATED = TO_DATE(''200203'', ''YYYYMM'')') > 0;

ID
----------
1


Execution Plan
----------------------------------------------------------
Plan hash value: 3282600506

------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2027 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 2027 | 2 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | IND_T_DOCS | | | | |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL','CREATED = TO_DATE(''200203'',
''YYYYMM'')')>0)

Note
-----
- dynamic sampling used for this statement


处理支持结构性查询条件外,使用INDEX SET索引集还可以支持索引列的排序操作

SQL> SET AUTOT OFF
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0;

ID
----------
1
3

SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'ORDER BY CREATED DESC') > 0;

ID
----------
3
1

这种将索引添加到索引集的操作也有一定的限制,比如结构性查询的操作只能包含大于、等于、小于、IN、BETWEEN AND操作。另外,加入到索引集中索引列的大小也有限制,对于VARCHAR2和CHAR类型,大小不能超过30字符,否则会报错:

用了不等于操作

SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'CREATED != TO_DATE(''200203'', ''YYYYMM'')') >0;
SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'CREATED != TO_DATE(''200203'', ''YYYYMM'')') >0
*
ERROR at line 1:
ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-10844: index cannot execute this structured predicate efficiently
DRG-10845: column CREATED ! is not indexed


建立超过30个字符的索引

SQL> DROP INDEX IND_T_DOCS;

Index dropped.

SQL> exec CTX_DDL.ADD_INDEX('TEST_INDEXSET', 'TITLE');

PL/SQL procedure successfully completed.

SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
2 PARAMETERS ('INDEX SET TEST_INDEXSET');
CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
*
ERROR at line 1:
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-20000: Oracle Text error:
DRG-12306: column TITLE is too long for index set columns
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CATINDEXMETHODS", line 97


SQL> select id,length(title) from t;

ID LENGTH(TITLE)
---------- -------------
1 21
2 41
3 20


SQL> update t set title=substr(title,1,30) where id=2;
update t set title=substr(title,1,30) where id=2
*
ERROR at line 1:
ORA-29861: domain index is marked LOADING/FAILED/UNUSABLE


SQL> drop index IND_T_DOCS;

Index dropped.

SQL> update t set title=substr(title,1,30) where id=2;

1 row updated.

SQL> commit;

Commit complete.

SQL> ALTER TABLE MYUSER.T MODIFY(TITLE VARCHAR2(30 BYTE));

Table altered.

SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
2 PARAMETERS ('INDEX SET TEST_INDEXSET');

Index created.

将字段长度改为varchar2(30)则可以建立

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/271283/viewspace-1022505/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2008-02-20

  • 博文量
    263
  • 访问量
    769119