Shell Script to Perform a Database Health Check
In this post I am sharing a shell script to do a full database health check and create a html report.
Assumption:
You should be aware of shell scripts.
You need to make changes before running in any of your environment.
You should be understanding Oracle Database.
Script is built for non-CDB environment and tested in same.
Note: Script will be helping in saving tie, but use it wisely by understanding what is being done.
Blindly using script will not make you a good DBA.
Report Sample Screenshot(Only one Portion)
Script:
#!/bin/bash# Oracle environment variables (adjust as necessary)#export ORACLE_SID=your_sid#export ORACLE_HOME=/path/to/oracle_home#export PATH=$ORACLE_HOME/bin:$PATH. ~/.bash_profile# Paths to alert log and timestamp fileALERT_LOG="/u01/db/TRAIN/12.1.0/admin/TRAIN_ebstraining/diag/rdbms/train/TRAIN/trace/alert_TRAIN.log"TIMESTAMP_FILE="/tmp/last_checked_timestamp.txt"# HTML report fileREPORT_FILE="health_check_report.html"# Function to run SQL queriesrun_sqlplus() {sqlplus -s / as sysdba <<EOFset heading offset feedback offset pagesize 0set verify offset echo offset linesize 1000set trimspool on$1exit;EOF}# Function to get the database nameget_database_name() {run_sqlplus "SELECT name FROM v\$database;" | awk '{print "<p>Database Name: " $0 "</p>"}'}# Function to check top 10 inactive sessionscheck_top_inactive_sessions() {run_sqlplus "SET LINESIZE 1000COLUMN username FORMAT A10COLUMN osuser FORMAT A10COLUMN machine FORMAT A20COLUMN program FORMAT A40SELECT * FROM (SELECT s.sid, s.serial#, s.username, s.status, s.osuser, s.machine, replace(s.program,' ','')FROM v\$session sWHERE s.status = 'INACTIVE'ORDER BY s.last_call_et DESC) WHERE ROWNUM <= 10;" | awk 'BEGIN{print "<h3>Top 10 Inactive Sessions</h3><table style=\"border-collapse: collapse; width: 100%;\" border=\"1\"><tr><th>SID</th><th>Serial#</th><th>Username</th><th>Status</th><th>OS User</th><th>Machine</th><th>Program</th></tr>"} {print "<tr><td>" $1 "</td><td>" $2 "</td><td>" $3 "</td><td>" $4 "</td><td>" $5 "</td><td>" $6 "</td><td>" $7 "</td></tr>"} END{print "</table>"}'}# Function to check database statuscheck_database_status() {run_sqlplus "SELECT open_mode FROM v\$database;" | awk '{print "<p>Database Status: " $0 "</p>"}'}# Function to check archive modecheck_archive_mode() {run_sqlplus "SELECT log_mode FROM v\$database;" | awk '{print "<p>Archive Mode: " $0 "</p>"}'}# Function to check for invalid objectscheck_invalid_objects() {run_sqlplus "SET LINESIZE 200COLUMN owner FORMAT A30COLUMN object_name FORMAT A50SELECT owner, object_name, object_typeFROM dba_objectsWHERE status = 'INVALID';" | awk 'BEGIN{print "<h3>Invalid Objects</h3><table style=\"border-collapse: collapse; width: 100%;\" border=\"1\"><tr><th>Owner</th><th>Object Name</th><th>Object Type</th></tr>"} {print "<tr><td><pre>" $1 "</pre></td><td><pre>" $2 "</pre></td><td><pre>" $3 "</pre></td></tr>"} END{print "</table>"}'}# Function to check scheduled jobscheck_scheduled_jobs() {run_sqlplus "SET LINESIZE 200COLUMN owner FORMAT A30COLUMN job_name FORMAT A50COLUMN next_run_date FORMAT A20SELECT owner, job_name, REPLACE(TO_CHAR(next_run_date, 'DD-MON-YYYY HH24:MI:SS'),' ','') next_run_dateFROM dba_scheduler_jobs;" | awk 'BEGIN{print "<h3>Scheduled Jobs</h3><table style=\"border-collapse: collapse; width: 100%;\" border=\"1\"><tr><th>Owner</th><th>Job Name</th><th>Next Run Date</th></tr>"} {print "<tr><td><pre>" $1 "</pre></td><td><pre>" $2 "</pre></td><td><pre>" $3 "</pre></td></tr>"} END{print "</table>"}'}# Function to check RMAN backup statuscheck_rman_backup() {run_sqlplus "SET LINESIZE 200COLUMN input_type FORMAT A10COLUMN status FORMAT A15COLUMN start_time FORMAT A25COLUMN end_time FORMAT A25COLUMN input_bytes FORMAT A15COLUMN output_bytes FORMAT A15SELECTREPLACE(TRIM(input_type), ' ', '') AS input_type,REPLACE(TRIM(status), ' ', '') AS status,REPLACE(TO_CHAR(start_time, 'DD-MON-YYYY HH24:MI:SS'),' ','') AS start_time,REPLACE(TO_CHAR(end_time, 'DD-MON-YYYY HH24:MI:SS'),' ','') AS end_time,REPLACE(TRIM(ROUND(input_bytes / 1024 / 1024, 2)) || ' MB', ' ', '') AS input_bytes,REPLACE(TRIM(ROUND(output_bytes / 1024 / 1024, 2)) || ' MB', ' ', '') AS output_bytesFROMv\$rman_backup_job_detailsORDER BYstart_time DESC;" | awk 'BEGIN{print "<h3>RMAN Backup Status</h3><table style=\"border-collapse: collapse; width: 100%;\" border=\"1\"><tr><th>Input Type</th><th>Status</th><th>Start Time</th><th>End Time</th><th>Input Bytes</th><th>Output Bytes</th></tr>"} {print "<tr><td><pre>" $1 "</pre></td><td><pre>" $2 "</pre></td><td><pre>" $3 "</pre></td><td><pre>" $4 "</pre></td><td><pre>" $5 "</pre></td><td><pre>" $6 "</pre></td></tr>"} END{print "</table>"}'}# Function to check database uptimecheck_database_uptime() {run_sqlplus "SELECT TO_CHAR(startup_time, 'DD-MON-YYYY HH24:MI:SS') FROM v\$instance;" | awk '{print "<p>Database Uptime: " $0 "</p>"}'}# Function to check tablespace utilization considering AUTOEXTENDcheck_tablespace_utilization() {run_sqlplus "SET LINESIZE 1000COLUMN tablespace_name FORMAT A20COLUMN total_space_mb FORMAT 999999.99COLUMN used_space_mb FORMAT 999999.99COLUMN free_space_mb FORMAT 999999.99COLUMN pct_used FORMAT 999.99COLUMN max_space_mb FORMAT 999999.99COLUMN max_free_space_mb FORMAT 999999.99COLUMN pct_max_used FORMAT 999.99SELECTdf.tablespace_name,ROUND(df.bytes / (1024 * 1024), 2) total_space_mb,ROUND((df.bytes - nvl(fs.bytes, 0)) / (1024 * 1024), 2) used_space_mb,ROUND(nvl(fs.bytes, 0) / (1024 * 1024), 2) free_space_mb,ROUND(((df.bytes - nvl(fs.bytes, 0)) / df.bytes) * 100, 2) pct_used,ROUND(nvl(df.maxbytes, df.bytes) / (1024 * 1024), 2) max_space_mb,ROUND((nvl(df.maxbytes, df.bytes) - (df.bytes - nvl(fs.bytes, 0))) / (1024 * 1024), 2) max_free_space_mb,ROUND(((df.bytes - nvl(fs.bytes, 0)) / nvl(df.maxbytes, df.bytes)) * 100, 2) pct_max_usedFROM(SELECT tablespace_name, SUM(bytes) bytes, SUM(DECODE(autoextensible, 'YES', maxbytes, bytes)) maxbytesFROM dba_data_filesGROUP BY tablespace_name) df,(SELECT tablespace_name, SUM(bytes) bytesFROM dba_free_spaceGROUP BY tablespace_name) fsWHERE df.tablespace_name = fs.tablespace_name(+);" | awk 'BEGIN{print "<h3>Tablespace Utilization</h3><table style=\"border-collapse: collapse; width: 100%;\" border=\"1\"><tr><th>Tablespace</th><th>Total MB</th><th>Used MB</th><th>Free MB</th><th>% Used</th><th>Max MB</th><th>Max Free MB</th><th>% Max Used</th></tr>"} {print "<tr><td>" $1 "</td><td>" $2 "</td><td>" $3 "</td><td>" $4 "</td><td>" $5 "</td><td>" $6 "</td><td>" $7 "</td><td>" $8 "</td></tr>"} END{print "</table>"}'}# Function to check server mount pointscheck_server_mounts() {df -h | awk 'BEGIN{print "<h3>Server Mount Points</h3><table style=\"border-collapse: collapse; width: 100%;\" border=\"1\"><tr><th>Filesystem</th><th>Size</th><th>Used</th><th>Avail</th><th>Use%</th><th>Mounted on</th></tr>"} {print "<tr><td>" $1 "</td><td>" $2 "</td><td>" $3 "</td><td>" $4 "</td><td>" $5 "</td><td>" $6 "</td></tr>"} END{print "</table>"}'}# Function to check OS-level resource usagecheck_os_resources() {echo "<h3>OS Resource Usage</h3><pre>$(top -b -n 1 | head -n 20)</pre>"# Function to check memory utilizationecho "<h3>Memory Utilization</h3>"echo "<table style=\"border-collapse: collapse; width: 100%;\" border=\"1\">"echo "<tr><th>Type</th><th>Total (MB)</th><th>Used (MB)</th><th>Free (MB)</th><th>Shared (MB)</th><th>Buffer/Cache (MB)</th><th>Available (MB)</th></tr>"# Get memory statisticsmem_stats=$(free -m | awk 'NR==2 {printf "<tr><td>Memory</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>", $2, $3, $4, $5, $6, $7}')echo $mem_stats# Get swap statisticsswap_stats=$(free -m | awk 'NR==3 {printf "<tr><td>Swap</td><td>%s</td><td>%s</td><td>%s</td><td>-</td><td>-</td><td>-</td></tr>", $2, $3, $4}')echo $swap_statsecho "</table>"}# Function to validate the database alert log and identify ORA error codescheck_alert_log_for_errors() {echo "<h3>Database Alert Log Errors</h3>"echo "<table style=\"border-collapse: collapse; width: 100%;\" border=\"1\">"echo "<tr><th>Timestamp</th><th>Error Message</th></tr>"# Get the last checked timestampif [ -f "$TIMESTAMP_FILE" ]; thenlast_checked=$(cat "$TIMESTAMP_FILE")elselast_checked="Sat May 11 00:00:00 2024"fi# Update the timestamp file with the current timecurrent_time=$(date '+%a %b %d %H:%M:%S %Y')echo "$current_time" > "$TIMESTAMP_FILE"# Convert last checked timestamp to seconds since epoch for comparisonlast_checked_epoch=$(date -d "$last_checked" '+%s')# Initialize a variable to keep track of the current log timestampcurrent_log_timestamp=""# Search for ORA errors in the alert log since the last checked timewhile IFS= read -r line; do# Check if the line contains a timestampif [[ $line =~ ^[A-Za-z]{3}\ [A-Za-z]{3}\ [0-9]{2}\ [0-9]{2}:[0-9]{2}:[0-9]{2}\ [0-9]{4}$ ]]; thencurrent_log_timestamp=$linecurrent_log_epoch=$(date -d "$current_log_timestamp" '+%s')fi# If the current log entry is after the last checked timestamp, check for ORA errorsif [[ -n $current_log_timestamp && $current_log_epoch -ge $last_checked_epoch ]]; thenif [[ $line =~ ORA-[0-9]{5} ]]; thenerror_message=$lineecho "<tr><td>$current_log_timestamp</td><td>$error_message</td></tr>"fifidone < "$ALERT_LOG"echo "</table>"}find_listener_process() {listener_process=$(ps aux | grep tnslsnr | grep $ORACLE_HOME | grep -v grep)if [ -z "$listener_process" ]; thenecho "<p>No listener process found for Oracle Home: $ORACLE_HOME</p>"returnfiecho "<h3>Listener Process Details</h3>"echo "<p>Listener process is running for Oracle Home: $ORACLE_HOME</p>"echo "<pre>$listener_process</pre>"}# Function to identify blocking sessions in the Oracle databasefind_blocking_sessions() {# Query to find blocking sessionsrun_sqlplus "set heading off feedback off pagesize 0 verify off echo offselect s1.username || '@' || s1.machine || ' ( SID=' || s1.sid || ' ) is blocking ' || s2.username || '@' || s2.machine || ' ( SID=' || s2.sid || ' ) ' AS blocking_statusfrom v\$lock l1, v\$session s1, v\$lock l2, v\$session s2where s1.sid=l1.sid and s2.sid=l2.sidand l1.BLOCK=1 and l2.request > 0and l1.id1 = l2.id1and l2.id2 = l2.id2 ;"| awk 'BEGIN{print "<h3>Blocking Sessions</h3><table style=\"border-collapse: collapse; width: 100%;\" border=\"1\"><tr><th>Blocking Details</th></tr>"} {print "<tr><td>" $1 "</td></tr>"} END{print "</table>"}'}# Function to retrieve resource limit utilization and max utilization in the databaseget_resource_utilization() {# Query to retrieve resource utilizationrun_sqlplus "col RESOURCE_NAME for a30col LIMIT_VALUE for a10select resource_name, current_utilization, max_utilization, limit_value from v\$resource_limit;" | awk 'BEGIN{print "<h3>Resource Utilization</h3><table style=\"border-collapse: collapse; width: 100%;\" border=\"1\"><tr><th>Resource Name</th><th>Current Utilisation</th><th>Max Utilisation</th><th>Limit Value</th></tr>"} {print "<tr><td>" $1 "</td><td>" $2 "</td><td>" $3 "</td><td>" $4 "</td></tr>"} END{print "</table>"}'}# Generate HTML report{echo "<html><head><title>Oracle Database Health Check Report</title>"echo "<style>"echo "body { font-family: Arial, sans-serif; }"echo "table { width: 100%; border-collapse: collapse; }"echo "th, td { padding: 8px 12px; text-align: left; }"echo "th { background-color: #f2f2f2; }"echo "tr:nth-child(even) { background-color: #f9f9f9; }"echo "</style>"echo "</head><body>"echo "<h2>Oracle Database Health Check Report</h2>"echo "$(get_database_name)"echo "$(check_database_status)"echo "$(find_listener_process)"echo "$(check_archive_mode)"echo "$(check_database_uptime)"echo "<h3>RMAN Backup Status</h3><pre>$(check_rman_backup)</pre>"echo "$(find_blocking_sessions)"echo "$(get_resource_utilization)"echo "$(check_tablespace_utilization)"echo "$(check_top_inactive_sessions)"echo "$(check_invalid_objects)"echo "$(check_scheduled_jobs)"echo "$(check_server_mounts)"echo "$(check_os_resources)"echo "$(check_alert_log_for_errors)"echo "</body></html>"} > "$REPORT_FILE"echo "Health check report generated at $REPORT_FILE"
Execution:
[oracle@ebstraining ~]$ sh db_health.sh
Health check report generated at health_check_report.html
Post a Comment
Post a Comment