索引，妙不可言

王硕 · 发表于 2015-4-6 23:03:00

第五章：索引天地妙不可言

1.btree索引

1.1 高度较低

1.2存储列值和 rowid

1.3结构有序

2.位图索引

3.函数索引

索引，由Root（根块），Branch（茎块），Leaf（叶子块）组成。

其中，Leaf（叶子块)主要存储了 key column value （索引列具体值）

以及能具体定位到数据所在位置的rowid。

[此贴子已经被作者于2015-04-06 23:09:39编辑过]

王硕 · 发表于 2015-4-6 23:05:36

5.2.3函数索引

5.2.3.1列运算让索引失去作用

drop table t purge;

create table t as select * from dba_objects;

create index idx_object_id on t(object_id);

create index idx_object_name on t(object_name);

create index idx_created on t(created);

select count(*) from t;

set autotrace traceonly

set linesize 1000

select * from t where upper(object_name)='T';

全表扫描逻辑读772

去掉列upper操作，则使用索引:

select * from t where object_name='T';

如果object_name 不存在小写字母。upper操作不影响结果。但是一般是大写的就不用upper查询了。

5.2.3.2正确使用函数索引：

用函数运算代替列名：

create index idx_upper_obj_name on t(upper(object_name));

select * from t where upper(object_name)='T';

逻辑读与普通索引一样，代价确实113，远大于普通索引的2，只是比全表扫描好，全表索引逻辑读为772，代价为175；

查看索引类型：

select index_name,index_type from user_indexes where table_name='T';

类型为FUNCTION-BASED NORMAL

5.2.3.3避免列运算

经典案例1：

set autotrace traceonly

set linesize 1000

select * from t where object_id-10<=30;

select * from t where object_id<=40;

要同时用列运算，又保证高效：

create index idx_object_id_2 on t(object_id-10);

select * from t where object_id-10<=30;

2.经典案例2

错误示范：

select * from t where substr(object_name,1,4)='CLUS';

select * from t where substr(object_name,1,5)='CLUST';

select * from t where substr(object_name，1,6）='CLUSTE';

正确示范：

select * from t where object_name like 'CLUS%';

select * from t where object_name like 'CLUST%';

select * from t where object_name like 'CLUSTE%';

3.经典案例3

错误示范：（找出两天的数据）

select *

from t

where trunc(created)>=TO_DATE('2012-10-02','YYYY-MM-DD'）

and trunc(created)<=TO_DATE('2012-10-03','YYYY-MM-DD');

只有建一个trunc相关的函数索引，否则用不上索引。

但是建立索引就有弊端,所以可以优化sql：

select *

from t

where created>=TO_DATA('2012-10-02','YYYY-MM-DD'）

and created<TO_DATE('2012-10-03','YYYY-MM-DD')+1

只要把<=改为<后面加个1 ，就可以不用把create去整了。

王硕 · 发表于 2015-4-6 23:05:24

5.2.2.4位图索引重复度低

位图索引：第一位图索引列重复列低；第二该表极少更新。

在object_id列建立为图索引，count（*）性能比较：

drop table t purge;

create table t as select * from dba_objec；

insert into t select * from t;

update t set object_id=rownum;

commit;

create bitmap index idx_bit_object_id on t(object_id);

select count(*) from t;

代价为10739 全表扫描

强制使用索引：

select /*+index(t,idx_bit_object_id)*/ count(*) from t;

代价12987，比全表扫描10739还高。

结论：btree索引存储的是列值，而位图索引存储的是比特位值。

一个生产库中，如果同一个列重复度低，但是更新频繁，就不能加位图索引。

王硕 · 发表于 2015-4-6 23:05:14

5.2.2.3位图索引不适合更新操作

建立第一个链接：

sqlplus ljb/ljb

select sid from v$mystat where rownum=1;

insert into t(name_id,gender,location,age_group,date) values(100001,'M',45,'child',rpad('*',20,'*'));

建立第二个链接：（不能插入）

sqlplus ljb/ljb

select sid from v$mystat where rownum=1;

insert into t(name_id,gender,location,age_group,date) values(100002,'M',46,'young',rpad('*',20,'*'));

建立第三个链接：（又能成功插入）

sqlplus ljb/ljb

select sid from v$mystat where rownum=1;

insert into t(name_id,gender,location,age_group,date) values(100003,'F',47,'Middle_age',rpad('*',20,'*'));

建立第四个链接：（不能插入）

sqlplus ljb/ljb

select sid from v$mystat where rownum=1;

insert into t(name_id,gender,location,age_group,date) values(100003,'F',48,'old',rpad('*',20,'*'));

第一次插入gender=’M'的记录，未提交前，再有新的session插入gender='M'的记录就会卡住，而插入gender='F'的记录就可以。不支持高并发。

删除location，age_group位图索引：

rollback；（4个会话）

drop index location_idx;

drop index age_group_idx;

测试锁：

第一个会话：

delete from t where gender='M' and location=25;

第二个会话：（插入带M的记录被阻挡，一下语句3个语句都会被阻止）

insert into t (name_id,gender,location,age_group,date) values(100001,'M',78,'young','TTT');

update t ser gender='M' where location=25;

delete from t where gender='M';

(以下不进行阻止，update 只要不更新位图索引所在的列即可）：

insert into t (name_id,gender,location,age_group,date) values

(100001,'M',78,'young','TTT');

delete from t where gender='F';

update t set location=100 where rowid not in (select rowid from t where gender='F' and location=25);

王硕 · 发表于 2015-4-6 23:05:02

5.2.2.2即席查询（多维度报表查询）

drop table t purge;

create table t

(name_id,

gender not null,

location not null,

age_group not null,

date

)

as

select rownum,

decode(ceil(dbms_random.values(0.2)),

1,'M',

2,'F')gender,

ceil(dbms_random.values(1,50)) location,

decode(ceil(dbms_random.values(0,3)),

1,'child',

2,'young',

3,'middle_age',

4,'old'),

rpad('*',20,'*')

from dual

在没有建立索引下：

set linesize 1000

set autotrace traceonly

select *

from t

where gender='M'

and location in (1,10,30)

and age_group='child';

cost为124

即席查询中，oracle不选择组合索引:

create idex idx_union on t(gender,location,age_group);

select *

from t

where gende='M'

and location in (1,10,30)

and age_group='child';

依然是全表扫描，cost为124

强制即席查询使用组合索引，情况更糟：

select /*+index(t,idx_union)*/ *

from t

where gender='M'

and location in (1,10,30)

and age_group='child';

代价为59960

在gender，location，age_group三个字段分别建立位图索引：

create bitmap index gender_idx on t(gender);

create birmap index location_idx on t(location);

create bitmap index age_group_idx on t(age_group);

select *

from t

where gender='M'

and location in (1,10,30)

and age_group='41 and over';

代价11。

王硕 · 发表于 2015-4-6 23:04:50

5.2.2.1统计条数

drop table t purge;

create table t as select * from dba_object；

insert into t select * from t;

update t set object_id=rownum;

commit;

set autotrace on

set linesize 1000

select count(*) from t;

在object_id 列建立索引，并设置该属性为非空，测试count（*）的执行计划：

create index idx_t_obj on t(object_id);

alter table T modify object_id not null;

set autotrace on

select count(*) from t;

对t表的STATUS列建一个位图索引：

create bitmap index idx_bitm_t_status on t(status);

select count(*) from t;

第一次查询无索引，全表扫描，代价10733，逻辑读48871；

第二次在object_id建立索引，通过索引快速全部扫描，代价1874，逻辑读8330；

第三次在status加位图索引，代价88，逻辑读95。

结论：count（*）的性能，在非空有btree索引下，一般用到该索引的性能远高于全表扫描。不过性能更高的却是有位图的索引。而已位图索引不需要考虑空值问题。

王硕 · 发表于 2015-4-6 23:04:39

5.2.1.11合理控制索引数量

对需要跟踪的索引进行监控：

alter index 索引名 monitoring usage;

通过观察v$object_usage进行跟踪：(未监控索引时，v$object_usage不存在任何记录)

select * from b$object_usage;

drop table t purge;

create table t as select * from dba_objects;

create index idx_t_id on t(object_id);

create index idx_t_name on t (object_name);

select * from v$object_name

alter index idx_t_id monitoring usage;

alter index idx_t_name monitoring usage;

set linesize 166

col INDEX_NAME for a10

col TABLE_NAME for a10

col MONITORING for a10

col USED for a10

col START_MONITORING for a25

col END_MONITORING for a25

select * from v$object_usage;

USED =NO 表示从监控以来没被使用过

查询必使用object_id 列的索引：

select object_id from t where object_id =19;

select * from v$object_usage;

解除监控：

alter index idx_t_id nomonitoring usage;

王硕 · 发表于 2015-4-6 23:04:27

5.2.1.10索引的危害

1.索引越多插入明显慢得多

drop table t_no_idx purge;

drop table t_1_idx purge;

drop table t_2_idx purge;

drop table t_3_idx purge;

drop table t_n_idx purge;

create table t_no_idx as select * from dba_objects;

insert into t_no_idx select * from t_no_idx;

commit;

select count(*) from t_no_idx;

create table t_1_idx as select * from t_no_idx;

create index idx_1_1 on t_1_idx(object_id);

create table t_2_idx as select * from t_no_idx;

create index idx_2_1 on t_2_idx(object_id);

create index idx_2_2 on t_2_idx(object_name);

create table t_3_idx as select * from t_no_idx;

create index idx_3_1 on t_3_idx(object_id);

create index idx_3_2 on t_3_idx(object_name);

create index idx_3_3 on t_3_idx(object_type);

每张表记录都是1781312条。

set timing on

insert into t_no_idx select * from t_no_idx where rownum<=100000;

insert into t_1_idx select * from t_1_idx where rownum<=100000;

commit;

insert into t_2_idx select * from t_2_idx where rownum<=100000;

insert into t_3_idx select * from t_3_idx where rownum<=100000;

结果没有索引，插入最快，索引越多，插入越慢。更新记录就更新了索引，就要维护索引那种有序排列的结构，使得开销变大。

2.无序插入索引影响更大

set timing on

insert into t_no_idx select * from t_no_idx where rownum<=100000 order by dbms_random.random ;

commit;

insert into t_1_idx select * from t_1_idx where rownum<=100000 order by dbms_random.random ;

insert into t_2_idx select * from t_1_idx where rownum<=100000 order by dbms_random.random ;

insert into t_3_idx select * from t_3_idx where rownum<=100000 order by dbms_random.random ;

order by dbms_random.random 表示无序插入，会重组原来的索引，重新定义，所以变慢。

优化：如果需要插入的数据过大，可使表的索引失效，等插入完毕再生效索引。

3.delete对性能影响

对于DELETE语句，在海量数据库定位删除少数记录是，这个条件列很必要。但是过多的列有索引会影响明显，因为其他列的索引也会被更新，在经常删除大量数据的时候，危害加剧。

delete删除索引列后，索引块中的相关需删除记录只是被打上一个删除标志而已，其实并没有真正删除。

4.建索引动作引发排序及锁

在建立索引是需要把当前索引列的列植取出来，排序后依次插入快中形成索引块，加上锁为了避免此时列植被更新，导致顺序又变化了，影响建索引的工作。

要避免在库繁忙时建立索引。

王硕 · 发表于 2015-4-6 23:04:14

5.2.1.9组合索引高效设计要领

1.组织索引能在适当的场合避免回表，动作为：table access by index rowid

2.例如

select count(*) from t where a=1;

select count(*) from t where b=2;

select count(*) from t where a=1 and b=2;

如果a=1 and b=2 的返回比前面单独a=1 或者b=2 的返回值少很多，在a ，b两个列建立组合索引就会高效很多，但不宜超过3个字段组合。

3.

（1）等值查询情况效率一样

drop table t purge;

create table t as select * from dba_objects;

create index idx1_object_id on t(object_id,object_type);

create index idx2_object_id on t(object_type,object_id);

set autotrace traceonly

set linesize 1000

select /*+index(t,idx1_object_id)*/ * from t where object_id=20 and object_type='TABLE';

select /*+index(t,idx2_object_id)*/ * from t where object_id=20 and object_type='TABLE';

consistent gets ,cost 代价相同，所以在等值查询下，组合索引的列无论哪一列在前，性能都一样。

（2）范围查询情况差异明显

set autotrace traceonly

set linesize 1000

select /*+index(t,idx1_object_id)*/ * from t where object_id>=20 and object_id<2000 and object_type='TABLE';

select /*+index(t,idx2_object_id)*/ * from t where object_id>=20 and object_id<2000 and object_type='TABLE';

逻辑读第一个为60，第二个为55,所以在组合索引中，当一列是范围查询，一列是等值查询的情况下，等值在前，范围在后更高效

王硕 · 发表于 2015-4-6 23:04:02

5.2.1.7活用索引有序

1.order by 排序优化

未排序前性能开销小：

set autotrace traceonly

set linesize 1000

drop table t purge;

create table t as select * from dba_objects;

set autotrace traceonly

select * from t where object_id>2;

排序后开销变大：

select * from object_id2 order by object_id;

5.2.1.8不可不说的主外键设计

第一：主键本身是一种索引，

第二：可以保证表中主键所在列的唯一性,

第三：可以有效地限制外键依赖的表的记录的完整性。

1.外键上的索引与性能

t_p设主键，t_c设外键。

drop table t_p cascade constraints purge;

drop table t_c cascade constraints purge;

create table t_p(id number,name varchar2(30));

alter table t_p add constraint t_p_id_pk primary key (id);

create table t_c (id number,fid number,name varchar2(30));

alter table t_c add constraint fk_t_c foreign key (fid) references t_p (id);

insert into t_p select rownum, table_name from all_tables;

insert into t_c select rownum, mod(rownum,1000)+1,object_name

from all_objects;

外键未建索引前的表连接性能分析：

set autotrace traceonly

set linesize 1000

select a.id,a.name,b.name from t_p a,t_c b where a.id=b.id and a.id=880;

外键建索引后的表连接性能分析：

create index ind_t_c_fid on t_c (fid);

select a.id,a.name,b.name from t_p a,t_c b where a.id=b.id and a.id=880;

逻辑读变低。

2.外键索引与锁的避免

外键有索引，没有死锁的存在：

会话1：select sid from v$mystat where rownum=1;

delete t_c where id=2;

会话2：select sid from v$mystat where rownum=1;

delete t_p where id=2000;

因为外键所在的列建有索引，避免了被锁。

3.主外键约束简单证明：

外键所在表的外键列取值必须在主表中的主键列有记录。

帐号		自动登录	找回密码
密码			注册

索引，妙不可言

浏览过的版块