Oracle database operation and maintenance scheme and optimization
Operation and maintenance optimization
This paper explains in detail how to operate and maintain Oracle database, and explains how to operate and maintain it from all aspects.
In the last article, we talked about the performance optimization of Oracle database. In this article, let's talk about the optimization of operation and maintenance.
Address of last article:
I. why and what are the contents of operation and maintenance?
The operation and maintenance of the database mainly provides a practical operation and maintenance construction mechanism in combination with the actual situation of the target system. The content covers the daily maintenance, emergency fault handling, software upgrade, etc. of the ORACLE database. Customers can customize accordingly according to the service content. We will provide comprehensive and targeted service solutions to ensure the stable, efficient and reliable operation of the customer system and achieve effective support for the business system.
II. Check the basic status of the database
Check the basic condition of the database, including:
- Check Oracle instance status
- Check Oracle service process
- Check Oracle listening process
2.1 check Oracle instance status
SQL> select instance_name,host_name,startup_time,status,database_status from v$instance;
Where "STATUS" indicates the current instance STATUS of Oracle and must be "OPEN"; "DATABASE_STATUS" indicates the current STATUS of Oracle database and must be "ACTIVE".
select name,log_mode,open_mode from v$database;
Where "LOG_MODE" indicates the current archiving mode of Oracle. "ARCHIVELOG" means that the database is running in archive mode, "NOARCHIVELOG" means that the database is running in non archive mode.
2.2 check Oracle service process
$ ps -ef|grep ora_|grep -v grep&&ps -ef|grep ora_|grep-v grep|wc –l
After checking the process command output of Oracle, the output display should include at least the following processes:
- The process of Oracle writing data files, and the output is displayed as: "ora_dbw0_CKDB"
- The process of Oracle writing log files. The output is displayed as: "ora_lgwr__ckdb"
- Oracle monitors the process of instance status, and the output is displayed as: "ora_smon_CKDB"
- Oracle monitors the process of client connection process status, and the output is displayed as: "ora_pmon_CKDB"
- The process of Oracle archiving, and the output is displayed as: "ora_arc0_ckdb"
- The process of Oracle checkpoint, and the output is displayed as: "ora_ckpt_ckdb"
- The process of Oracle recovery, and the output is displayed as: "ora_reco_ckdb"
2.3 check Oracle listening status
The "Services Summary" item indicates which database instances the Oracle listening process is listening to. At least the "CKDB" item should be displayed in the output.
Check whether the listening process exists:
[oracle@AS14 ~]$ ps -ef|grep lsn|grep -v grep
III. check the system and oracle log files
Check the relevant log files, including: check the log files of the operating system, check the Oracle log files, check the Oracle core dump directory, and check the email of the Root user and Oracle user. There are four parts in total.
3.1 check the operating system log file
cat /var/log/messages |grep failed
Check for error messages related to Oracle users.
3.2 check oracle log files
[oracle@AS14 ~]$ cat /data/oracle/admin/CKDB/bdump/alert_CKDB.log |grep ora- [oracle@AS14 ~]$ cat /data/oracle/admin/CKDB/bdump/alert_CKDB.log |grep err [oracle@AS14 ~]$ cat /data/oracle/admin/CKDB/bdump/alert_CKDB.log |grep fail
During the operation of Oracle, some operation conditions of the database will be recorded in the warning log file (alert_SID.log): startup and shutdown of the database, and non default parameters during startup; The redo log switching of the database, recording the time of each switching, and if the switching cannot be performed because the checkpoint operation is not completed, the reason why the switching cannot be performed will be recorded; Some operations on the database, such as creating or deleting table spaces and adding data files; Errors occurred in the database, such as insufficient table space, bad blocks, internal database errors (ORA-600), etc. Check the log files regularly and deal with the problems found in the log in time:
|Incorrect startup parameters||Check initialization parameter file|
|The redo log cannot be switched because the checkpoint operation or archive operation is not completed||If this happens frequently, consider adding redo log file groups; Find ways to improve the efficiency of checkpoint or archiving operations;|
|Someone deleted the tablespace without authorization||Check the security of the database and whether the password is too simple; If necessary, revoke the system permissions of some users|
|Bad block||Check whether there is a hardware problem (such as a bad block in the disk itself). If not, check whether there is a bad block in the database object, and rebuild the object|
|Insufficient table space||Add the data file to the corresponding table space|
|ORA-600||Check the corresponding TRC file according to the contents of the log file. If it is an Oracle bug, patch it in time|
3.3 check Oracle core dump directory
$ls $ORACLE_BASE/admin/CKDB/cdump/*.trc|wc -l $ls $ORACLE_BASE/admin/CKDB/udump/*.trc|wc –l
If the result of the above command increases every day, it indicates that Oracle processes often have core dumps. This indicates that some user processes or database background processes exit abnormally due to reasons that cannot be handled. Frequent core dump, especially the core dump of database background process, will lead to abnormal termination of database.
3.4 check email of Root user and Oracle user
tail –n 200 /var/mail/root tail –n 200 /var/mail/oracle
Check for error messages related to Oracle users.
IV. check Oracle object status
Check the status of relevant Oracle objects, including: check the status of Oracle control files, check the status of Oracle online logs, check the status of Oracle tablespaces, check the status of all Oracle data files, check the status of all Oracle tables, indexes, stored procedures, triggers, packages and other objects, and check the status of all Oracle rollback segments. There are six parts in total.
4.1 check Oracle control file status
select status,name from v$controlfile;
STATUS NAME /data/oradata/CKDB/control01.ctl /data/oradata/CKDB/control02.ctl /data/oradata/CKDB/control03.ctl
The output result should have more than 3 records (including 3), and "STATUS" should be empty. If the STATUS is empty, the control file STATUS is normal.
4.2 check Oracle online log status
select group#,status,type,member from v$logfile;
The output result should have more than 3 (including 3) records, "STATUS" should be non "INVALID" and non "DELETED". Note: if "STATUS" is blank, it means normal.
4.3 check the status of Oracle tablespaces
SQL> select tablespace_name,status from dba_tablespaces;
The STATUS in the output result should all be ONLINE
4.4 check the status of all Oracle data files
SQL> select name,status from v$datafile;
"STATUS" in the output result should be "ONLINE".
SQL> select file_name,status from dba_data_files;
"STATUS" in the output result should be "AVAILABLE".
4.5 check invalid objects
select owner,object_name,object_type from dba_objects where status!='VALID' and owner!='SYS' and owner!='SYSTEM';
If a record is returned, it indicates that there is an invalid object. If these objects are related to the application, you need to recompile and generate this object, or:
SELECT owner, object_name, object_type FROM dba_objects WHERE status= 'INVALID';
4.6 check the status of all rollback segments
SQL> select segment_name,status from dba_rollback_segs;
The "STATUS" of all rollback segments in the output result should be "ONLINE".
V. check the relevant parameter values in the Oracle initialization file
Check the usage of Oracle related resources, including: check the relevant parameter values in the Oracle initialization file, check the database connection, check the system disk space, check the usage of Oracle table space, check some objects with abnormal expansion, check the contents in the system table space, check the next expansion of the object and the maximum expansion value of the table space, There are seven parts in total.
5.1 check relevant parameter values in Oracle initialization file
SQL> select resource_name,max_utilization,initial_allocation, limit_value from v$resource_limit;
If limit_ VALU-MAX_ If utilization < = 5, it indicates that it is related to RESOURCE_NAME related
Oracle initialization parameters need to be adjusted. You can modify the Oracle initialization parameter file
$ORACLE_BASE/admin/CKDB/pfile/initORCL.ora to modify.
5.2 check database connection
Check whether the number of current session connections is within the normal range.
select sid,serial,username,program,machine,status from v$session;
Of which: SID conversation(session)of ID number; SERIAL# The serial number of a session, which is used together with SID to uniquely identify a session; USERNAME User name for establishing the session; PROGRAM What tools are used to connect this session to the database; STATUS The current state of this session, ACTIVE Indicates that the session is performing some tasks, INACTIVE Indicates that the current session has not performed any operation;
If too many connections are established, the resources of the database will be consumed. At the same time, some "hanging" connections may need to be cleaned up manually. If the DBA wants to manually disconnect a session, Execute: (it is generally not recommended to use this method to kill the connection to the database, so that sometimes the session will not be disconnected. It is easy to cause dead connection. It is recommended to check the spid of the operating system through sid, and use ps – ef|grep spidno to confirm that the spid is not the background process of ORACLE. Use the kill -9 command of the operating system to kill the connection)
alter system kill session 'SID,SERIAL#';
Note: in the above example, sessions with SID from 1 to 10 (the username column is empty) are Oracle background processes. Do not operate on these sessions.
5.3 check the system disk space
If the remaining space of the file system is too small or growing rapidly, confirm it and delete unused files to free up space.
5.4 check table space usage
SQL> select f.tablespace_name,a.total,f.free,round((f.free/a.total)*100) "% Free" from (select tablespace_name, sum(bytes/(1024*1024)) total from dba_data_files group by tablespace_name) a, (select tablespace_name, round(sum(bytes/(1024*1024))) free from dba_free_space group by tablespace_name) f WHERE a.tablespace_name = f.tablespace_name(+) order by "% Free";
If the idle rate% Free is less than 10% (including 10%), pay attention to adding data files to expand the table space instead of using the automatic expansion function of data files. Please do not add too many data files to the table space. The principle of adding data files is that the size of each data file is 2G or 4G, and the maximum limit of automatic expansion is 8G.
5.5 check some objects with extension exceptions
select Segment_Name, Segment_Type, TableSpace_Name, (Extents/Max_extents)*100 Percent From sys.DBA_Segments Where Max_Extents!= 0 and (Extents/Max_extents)*100>=95 order By Percent;
If a record is returned, the extension of these objects has almost reached the maximum extension value when it is defined. For these objects, you need to modify its storage structure parameters.
5.6 check the contents in the system tablespace
select distinct(owner) from dba_tables where tablespace_name='SYSTEM' and owner!='SYS' and owner!='SYSTEM' union select distinct(owner) from dba_indexes where tablespace_name='SYSTEM' and owner!='SYS' and owner!='SYSTEM';
If the record returns, it indicates that there are some objects of non system and sys users in the system table space. We should further check whether these objects are relevant to our application. If relevant, move these objects to non system tablespaces, and check the default tablespace values of the owners of these objects.
5.7 check the next extension of the object and the maximum extension value of the table space
select a.table_name, a.next_extent, a.tablespace_name from all_tables a, (select tablespace_name, max(bytes) as big_chunk from dba_free_space group by tablespace_name ) f where f.tablespace_name = a.tablespace_name and a.next_extent > f.big_chunk union select a.index_name, a.next_extent, a.tablespace_name from all_indexes a, (select tablespace_name, max(bytes) as big_chunk from dba_free_space group by tablespace_name ) f where f.tablespace_name = a.tablespace_name and a.next_extent > f.big_chunk;
If a record is returned, it indicates that the next extension of these objects is greater than the maximum extension value of the table space to which the object belongs. It is necessary to adjust the storage parameters of the corresponding table space.
Vi. check database security
Checking the security of Oracle database includes two parts: checking the system security information and changing the password regularly.
6.1 check system security log information
The directory of the system security log file is under / var/log, which mainly checks the user log information of successful or failed login.
Check log of successful login:
[root@rac2 ~] grep -i accepted /var/log/secure Jan 8 08:44:43 rac2 sshd: Accepted password for root from ::ffff:10.10.10.6 port 1119 ssh2......
Check log of login failures:
[root@rac2 ~] grep -i inval /var/log/secure &&grep-i failed /var/log/secure Jan 9 10:30:44 rac2 sshd: Invalid user ydbuser from ::ffff:192.168.3.5 Jan 9 10:30:56 rac2 sshd: Failed password for invalid user ydbuser from ::ffff:192.168.3.5 port 36005 ssh2 Jan 9 10:30:56 rac2 sshd: Failed password for invalid user ydbuser from ::ffff:192.168.3.5 port 36005 ssh2 Jan 10 22:44:38 rac2 sshd: Failed password for root from ::ffff:10.10.10.6 port 1723 ssh2
There is no error (Invalid, rejected) prompt in the log information. If there is no (Invalid
refused) it is deemed that the system is normal. In case of error prompt, a system alarm notice shall be given.
5.2 check the user's modified password
There are often many users in the database system, such as the third-party database monitoring system, the demonstration user during the initial installation of the database, the administrator user, etc. the passwords of these users are often written, known by many people, and will be used by people with ulterior motives to attack the system and even modify the data. Users who need to change their password include:
- Database administrator user SYS, SYSTEM; Other users.
- After logging in the system, enter cat /etc/passwd at the prompt to check whether there are accounts that are no longer used or unfamiliar among the listed users. If it exists, it is recorded as an exception.
Password modification method:
Sql>alter user USER_NAME identified by PASSWORD;
VII. Other inspections
Check whether the current crontab task is normal and whether the Oracle Job fails.
7.1 does Oracle job fail
select job,what,last_date,next_date,failures,broken from dba_jobs where schema_user='CAIKE';
If there are problems, it is recommended to rebuild the job, such as exec sys dbms_ job. remove(1); commit; exec sys. dbms_ job. isubmit(1,‘REFRESH_ALL_SNAPSHOT;’,SYSDATE+1/1440,‘SYSDAT E+4/1440’); commit;
7.2 monitor the growth of data volume
SQL> select A.tablespace_name,(1-(A.total)/B.total)*100 used_percent from (select tablespace_name,sum(bytes) total from dba_free_space group by tablespace_name) A, (select tablespace_name,sum(bytes) total from dba_data_files group by tablespace_name) B where A.tablespace_name=B.tablespace_name;
According to the daily inspection this week, find the database objects with fast space expansion and take corresponding measures:
- Delete historical data
The historical data of at least 6 months is reserved in the mobile database, so the previous historical data can be backed up and then cleared to free up its resource space.
- Extended table space
alter tablespace <tablespace_name> add datafile '<file>' size <size> autoextend off;
Note: when the database structure changes, such as adding table space, adding data files or redo log files, these operations will cause changes in the Oracle database control file. DBA should backup the control file in time. The backup method is:
Execute SQL statement:
alter database backup controlfile to '/home/backup/control.bak';
alter database backup controlfile to trace;
In this way, it will be in user_ DUMP_ Generate the SQL command to create the control file under the DeST (specified in the initialization parameter file).
7.3 index of inspection failure
Sql>select index_name,table_name,tablespace_name,status From dba_indexes Where owner='CTAIS2' And status<>'VALID';
Note: it is normal that the index status on the partition table is N/A. if there is an invalid index, rebuild the index, such as SQL > ALTER INDEX index index_ NAME rebuild tablespace TABLESPACE_ NAME;
7.4 check for inoperative constraints
SELECT owner, constraint_name, table_name, constraint_type, status FROM dba_constraints WHERE status ='DISABLE' and constraint_type='P';
Enable if there is a failure constraint, such as:
Sql>alter Table TABLE_NAME Enable Constraints CONSTRAINT_NAME;
7.5 check for invalid trigger s
SELECT owner, trigger_name, table_name, status FROM dba_triggers WHERE status = 'DISABLED';
Enable if there is a failure trigger, such as:
Sql>alter Trigger TRIGGER_NAME Enable;
If there are any mistakes in this article, please point them out in your comments or contact me, and I will correct them. If you think this article is useful, please help one click three times to let more people see it. Thank you
Author yang_z_1 csdn blog address: https://blog.csdn.net/yang_z_1?type=blog