Title: Oracle Cluster Health Monitor (CHM) using large amount of space (more than default) (Doc ID 1343105.1)
Author: 郑全
Date: 2021-5-17 11:48
In this Document
Symptoms
Cause
Solution
References
APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.2 and later
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Backup Service - Version N/A and later
Information in this document applies to any platform.
SYMPTOMS
Cluster Health Monitor (CHM) files in the $GI_HOME/crf/db/<node name> directory are filling up disk space in GI_HOME.
The bdb files in $GI_HOME/crf/db/<node name> are larger than 1GB (the default size) and are filling up the GI_HOME file system.
CAUSE
Oracle Cluster Health Monitor (CHM) uses a large amount of space while it is collecting OS statistics.
Check the Cluster Health Monitor Berkeley database (bdb) files in the $GI_HOME/crf/db/<node name> directory.
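One way to run this check is sketched below. The function name list_large_bdb and its megabyte argument are made up for illustration and are not part of CHM. It assumes a find that accepts the "k" size unit (GNU and BSD find do).

```shell
#!/bin/sh
# Hypothetical helper (not part of CHM): print the *.bdb files under a
# directory whose size exceeds a limit given in megabytes.
# The default of 1024 MB corresponds to the 1GB default size.
list_large_bdb() {
    repo_dir=$1
    limit_mb=${2:-1024}
    # find rounds file sizes up to whole KiB, so "+N" with the "k" unit
    # matches files larger than N KiB.
    find "$repo_dir" -type f -name '*.bdb' -size +"$((limit_mb * 1024))"k
}

# Example call (replace <node name> with the actual directory name):
#   list_large_bdb "$GI_HOME/crf/db/<node name>"
```

No output means no bdb file exceeds the limit. A smaller second argument, for example 512, lowers the threshold.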
SOLUTION
Remove those large Berkeley database files to free up space by doing the following as root:
$GI_HOME/bin/crsctl stop res ora.crf -init
cd $GI_HOME/crf/db/<node name>
rm *.bdb
$GI_HOME/bin/crsctl start res ora.crf -init
Please note that the bdb files are regenerated when the CHM (ora.crf) resource is restarted. The files are owned by root, so only root can delete them. Apart from losing the OS statistics that CHM has already gathered, deleting the bdb files has no other impact. CHM will start collecting OS statistics again.
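The same steps can be wrapped in a small script so that nothing is deleted if the stop command fails, and so that rm never runs in the wrong directory after a failed cd. This is only a sketch. The function name purge_chm_bdb and the CRSCTL override variable are assumptions introduced here. The crsctl arguments are the ones shown above.

```shell
#!/bin/sh
# Hypothetical wrapper around the steps above; run it as root.
# CRSCTL is an assumed override variable; by default the crsctl binary
# under $GI_HOME/bin is used.
purge_chm_bdb() {
    repo_dir=$1
    crsctl_bin=${CRSCTL:-$GI_HOME/bin/crsctl}

    # Refuse to continue if the repository directory does not exist.
    if [ ! -d "$repo_dir" ]; then
        echo "no such directory: $repo_dir" >&2
        return 1
    fi

    # Stop CHM first; if that fails, leave the files untouched.
    "$crsctl_bin" stop res ora.crf -init || return 1

    # Remove only the bdb files, addressed by full path rather than via cd.
    rm -f "$repo_dir"/*.bdb

    # Restart CHM, which regenerates the bdb files.
    "$crsctl_bin" start res ora.crf -init
}

# Example call (replace <node name> with the actual directory name):
#   purge_chm_bdb "$GI_HOME/crf/db/<node name>"
```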
One reason for large bdb files is a long retention period. Issue "$GI_HOME/bin/oclumon.pl" to get the current value of the retention period, which is given in seconds. Issue "oclumon manage -repos resize <desired value in seconds>" to change the retention period. For example, "oclumon manage -repos resize 259200" retains the CHM data for 3 days (259200 is the number of seconds in 3 days).
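Because the resize value is given in seconds, a small conversion helper avoids arithmetic slips. The sketch below uses a made-up function name, days_to_seconds. Only the oclumon command line in the comment comes from the note itself.

```shell
#!/bin/sh
# Hypothetical helper: convert a whole number of days into seconds.
days_to_seconds() {
    echo "$(( $1 * 24 * 60 * 60 ))"
}

# Example: keep 3 days of CHM data (3 days is 259200 seconds):
#   oclumon manage -repos resize "$(days_to_seconds 3)"
```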
Another cause of very large bdb files (greater than 2GB) is a bug, because by default the bdb size is limited to 1GB unless the CHM data retention time is increased. One such bug is 10165314.
Also, please note that the local bdb file (<hostname>.ldb) may need to be deleted as well.