索引，妙不可言

王硕 · 发表于 2015-4-6 23:03:49

5.2.1.6 巧用三特征之存储列值

1.count(*)优化

drop table t purge;

create table t as select * from dba_objects;

create index idx1_object_id on t(object_id);

select count(*) from t;

count(*)在索引列有空值时无法用到索引：

set autotrace on

set linesize 1000

set liming on

select count(*) from t;

因为，索引不能存储空记录。如果索引列有空记录，那依据索引来统计表的记录数，不会用到索引。

明确索引列为空(只查询非空列），即可让count（*）用到索引：

set autotrace on

set linesize 10

set liming on

select count(*) from t where object_id is not null;

查看T表的列是否为空：

desc t；

由于T表是create建立起来，所以当前列的属性全部允许为空。

当把列的属性改为非空就可以直接用到索引。主键也必须走索引。

更改object_id的属性，修改为不允许为空：

alter table t modify object_id not null;

之后发现count（*）用到索引。

由于，索引含有列值，如果一个表只有一个列，则走索引会更慢。

2.SUM/AVG优化

drop table t purge;

create table t as select * from dba_objects;

create index idx1_object_id on t(object_id);

SUM/AVG用不到索引（因为列允许为空）

set autotrace on

set linesize 1000

set liming on

select sum(object_id) from t;

SUM、AVG、COUNT综合写法试验：

select sum(object_id),avg(object_id),count(*) from t where object_id is not null;

三个聚合语句在一起，只需一次扫描索引块，就同时解决三个问题，同单个问题所用时间一样。

3.MAX/MIN优化

select max(object_id) from t;

max/min 无论列是否为空都可以用索引（也可能不用）。

更由于max索引之走索引块最右边最后一个块，所以逻辑读极少（无论表的大小）。

min和max同时写的优化（空值导致用不到索引）：

set autotrace on

set linesize 1000

set min(object_id),max(object_id) from t;

增加is not null,同时写min、max的优化（无法用index full scan（min/max））：

select min（object_id),max(object_id) from t where object_id is not null;

虽然走了索引，扫描方式却是 index fast full scan 而非index full scan(min/max)

4.索引回表读（table access by index rowid)

drop table t purge;

create table t as select * from dba_objects;

create index idx1_object_id on t(object_id);

set autotrace traceonly

set linesize 1000

set liming on

select * from t where object_id<=5；

* 使得扫描索引定位索引块后，重新定位整个t表

如果返回列与索引列不同：

select object_id,object_name from t where object_id<=5;

建立组合索引消除回表：

create index idx_un_objid_objname on t(object_id,object_name);

聚合因子，建有序和无序的表各一张：

drop table t_colocated purge;

create table t_colocated (id number,col2 varchar2(100));

begin

for i in 1..100000

loop

insert into t_colocated(id,col2)

values(i,rpad(dbms_random.random,95,'*'));

end loop;

end;

/

alter table t_colocated add constraint pk_t_colocated primary key(id);

drop table t_disorganized purge;

create table t_disorganized

as

select id,col2

from t_colocated

order by col2;

alter table t_disorganized add constraint pk_t_disorg primary key(id);

t_colocated 中 1到100000顺序插入,聚合因子比较低。

t_disorganized随机插入，聚合因子比较高

分析聚合因子：

set linesize 1000

select index_name,

blevel,

leaf_blocks,

num_rows,

distinct_keys,

clustering_factor

from user_ind_statistics

where table_name in('T_COLOCATED','T_DISORGANIZED');

其中CLUSTERING_FACTOR：表明有多少临近的索引条目指到不同的数据块。

[此贴子已经被作者于2015-06-28 01:30:13编辑过]

王硕 · 发表于 2015-4-6 23:03:19

5.2.1.5妙用三特征之高低较低

drop table t1 purge；

drop table t2 purge;

drop table t3 purge;

drop table t4 purge;

drop table t5 purge;

drop table t6 purge;

drop table t7 purge;

create table t1 as select rownum as id,rownum+1 as id2 from dual connect by level<=5;

create table t2 as select rownum as id,rownum+1 as id2 from dual connect by level<=50;

create table t3 as select rownum as id,rownum+1 as id2 from dual connect by level<=500;

create table t4 as select rownum as id,rownum+1 as id2 from dual connect by level<=5000;

create table t5 as select rownum as id,rownum+1 as id2 from dual connect by level<=50000;

create table t6 as select rownum as id,rownum+1 as id2 from dual connect by level<=500000;

create table t7 as select rownum as id,rownum+1 as id2 from dual connect by level<=5000000;

create index idx_id_t1 on t1(id);

create index idx_id_t2 on t2(id);

create index idx_id_t3 on t3(id);

create index idx_id_t4 on t4(id);

create index idx_id_t5 on t5(id);

create index idx_id_t6 on t6(id);

create index idx_id_t7 on t7(id);

观察比较各个索引的大小：

select segment_name,bytes/1024

from user_segments

where segment_name in ('IDX_ID_T1','IDX_ID_T2','IDX_ID_T3','IDX_ID_T4','IDX_ID_T5','IDX_ID_T6','IDX_ID_T7');

观察比较各个索引的高度：

select index_name,blevel,leaf_blocks,num_rows,distinct_keys,clustering_factor

from user_ind_statistics where table_name in ('T1','T2','T3','T4','T5','T6','T7');

由于一个区含有8个块，一个块8KB，即便是一个空表，大小也是64KB。

如果删除了索引，再执行查看查询性能。

5.2.1.5.1分区索引设计误区

建立分区表part_tab ,插入数据，并分别在col2上建局部索引，在col3上建立全局索引：

drop table part_tab purge;

create table part_tab (id int,col2 int,col3 int)

partition by range(id)

(

partition p1 values less than (10000),

partition p2 values less than (20000),

partition p3 values less than (30000),

partition p4 values less than (40000),

partition p5 values less than (50000),

partition p6 values less than (60000),

partition p7 values less than (70000),

partition p8 values less than (80000),

partition p9 values less than (90000),

partition p10 values less than (100000),

partition p11 values less than (maxvalue)

)

;

insert into part_tab select rownum,rownum+1,rownum+2 from dual connect by rownum<=110000;

commit;

create index idx_par_tab_col2 on part_tab(col2) local;

create index idx_par_tab_col3 on part_tab(col3);

查看分区表各个分区及其类型：

col segment_name format a20

select segment_name,partition_name,segment_type

from user_segments

where segment_name='PART_TAB';

select segment_name,partition_name,segment_type

from user_segments

where segment_name='IDX_PAR_TAB_COL2';

select segment_name,partition_name,segment_type

from user_segments

where segment_name='IDX_PAR_TAB_COL3';

建立一个普通表norm_tab，与part_tab类似：

drop table norm_tab purge;

create table norm_tab（id int,col2 int,col3 int);

insert into norm_tab select rownum,rownum+1,rownum+2 from dual connect by rownum<=110000;

commit;

create index idx_nor_tab_col2 on norm_tab(col2);

create index idx_nor_tab_col3 on norm_tab(col3);

查询性能分析：

set autotrace traceonly

set linesize 1000

set timing on

select * from part_tab where col2=8;

select * from norm_tab where col3=8;

针对分区表查询逻辑读为普通表的6倍，cost也为6倍。

分区表索引IO个数要乘以分区个数，分区查询，导致查询变慢。

分别观察分区表和普通表的索引高度：

select index_name,

blevel,

leaf_blocks,

num_rows,

distinct_keys,

clustering_factor

from user_ind_partitions

where index_name='IDX_PAR_TAB_COL2';

select index_name,

blevel,

leaf_blocks,

num_rows,

distinct_keys,

clustering_factor

from user_ind_statistics

where index_name='IDX_PAR_TAB_COL2';

分区索引扫描落在某一个分区，性能大幅提升：

select * from part_tab where col2=8 and id=7;

这里pstart 与pstop 只遍历一个分区。

[此贴子已经被作者于2015-06-28 01:06:17编辑过]

帐号		自动登录	找回密码
密码			注册