ÿþ<html> <title>antapex.org : Solving Oracle Block Corruption</title> <body> <h1>Some notes on handling Oracle corruption</h1> <B>Version</B> : 1.1<br> <B>Date</B> : 28/10/2010<br> <B>By</B> : Albert van der Sel<br> <B>Type of doc</B> : It's just a few notes on Oracle Block and File corruption in very simple words. It's no more than "entry level".<br> <B>For who</B> : For anyone who likes a short orientation on the subject.<br> <hr/> <br> <br> <font face="arial" size=2 color="black"> <B>Important:</B><br> <ol> <li>Obviously, when you experience some <B>serious problem</B> with an Oracle Database (which you cannot solve),<br> you would then ofcourse contact Oracle Support for advice and help.</li><br> <li>So, this document can only "at best" be viewed as "supplementary".</li><br> <li>Also, this document may contain procedures which formally may not be executed without consulting<br> with Oracle Support.</li><br> <li>Any note found on the internet, including this one, intrinsically will have a certain "risk" involved.</li><br> <li>Always try recovery procedures first on testsystems, to prepare you for the "real challenges" on production systems.</li><br> <li>Never perform any of the commands or tools listed here, on production systems, without sufficient testing on test systems.</li><br> <li>For experimenting on testsystems, I believe this document is usable.</li><br> </ol> <font face="arial" size=2 color="red"> <B>In short:<br> <br> You are encouraged to follow this note, and play around on a testsystem, where you would not care if it breaks down.<br> But, you should only view this note as a (lightweight) demo for certain simple types of recoveries of corruption, performed on testsystems only.<br> </B> <br> <font face="arial" size=2 color="black"> <br> Although you do not need to be an expert, it is assumed that you have at least a "reasonable" level of knowledge of Oracle.<br> <br> But a note like this one, has some severe limitations as well.<br> <br> (1): For example, just take in consideration on how many platforms Oracle can run on. Just to name a few examples:<br> Unix versions (like AIX, HPUX, Solaris), Linux distro's, Windows versions, and some more "exotic" machines like Z, VMS etc..<br> The question is: is it really important? Well.., for this simple note not so terribly much.<br> But it must be said, that for certain types of errors, the solutions might be different per platform.<br> And, generally, the "deeper" you delve into a problem, the more the "specifics" of a certain platform, will play a role.<br> <br> (2): Secondly, an RDBMS like Oracle can use many sorts of "filesystems", for example the plain old "true" filesystems, or "raw" volumes,<br> and ofcourse the Oracle specific ASM storage.<br> The question is: is it really important? Well.., again, for this simple note not so terribly much.<br> But here too it must be said, that for certain types of errors, the solutions might be different per type of storage.<br> And, generally, the "deeper" you delve into a problem, the more the "specifics" of a certain storage type, will play a role.<br> <br> (3): Thirdly, Oracle knows many types of objects (tables, indexes, LOB's, IOT's, clusters etc..). This note considers<br> only regular tables and indexes.<br> <br> You see?:<br> For example, a document that deals with MS SQL Server, that can run on <I>just one type of Operating System (Windows)</I>,<br> will be automatically "generic".<br> On the other hand, a document that deals with a RDBMS that can run on <I>many types of Operating Systems,</I><br> is NOT "generic", if the author just only considers one type of Operating System (which I do).<br> <br> It's no joke: somebody really could write a whole book on corruptions in Oracle and how to deal with them.<br> <br> In any case, I tell you,<I> that for the scope of this document</I>, the differences among the platforms will really<br> not bother us <I>too</I> much.<br> <br> In this note, I will take the Windows platform, with Oracle 11g, and using the familiar regular filesystems.<br> <br> - Here, I choose Windows, since practically everybody has acces to a Windows machine, that's capable of running Oracle 9i, 10g or 11g.<br> - About the Oracle versions: although many differences exists between 10g and 11g, again for the scope of this note,<br> the choice between 10g or 11g, does not bother us too much.<br> <br> So what can you expect to find in this note?<br> <br> Chapter 1 just describes the creation of a 'SALES' database which will be used to simulate several sorts of corruption.<br> Chapter 2 will demonstrate a few examples of a soft and hard corruption of data blocks.<br> Chapter 3 actually is the "body" of this note. It will show you how to repair an object with corrupt blocks.<br> <br> <br> <B>Contents:</B><br> <br> <B>Chapter 1. Creating the example database, and getting a hex editor.</B><br> <br> <B>Chapter 2. Examples of soft and hard block corruption (let's do it ourselves).</B><br> <br> - 2.1. Soft block corruption.<br> - 2.2. Soft corruption of a Database block, and the DB_BLOCK_CHECKING init.ora parameter.<br> - 2.3. Analyzing Tables and Files for corruption (using "dbv" and "ANALYZE TABLE").<br> - 2.4. Hard corruption of a Database block (writing all zero's).<br> <br> <B>Chapter 3. Repairing objects with corrupt blocks.</B><br> <br> - 3.1. The DBMS_REPAIR.SKIP_CORRUPT_BLOCKS() procedure (quick fix)<br> - 3.2. The DBMS_REPAIR.FIX_CORRUPT_BLOCKS() procedure.<br> - 3.3. The ORA-1578 and ORA-8103 errors.<br> - 3.3. Using SET EVENTS parameters.<br> - 3.5. Querying "around" bad blocks.<br> - 3.6. Some remarks on the structure of a block.<br> - 3.7. Using tools and statements to "dump" blocks.<br> <br> <br> <br> <h2>Chapter 1. Creating the example database, and getting a hex editor.</h2><br> <h3>1.1 Creating a Database and some Objects:</h3><br> If you have installed Oracle 10g/11g, you might already have an Instance (and database) running.<br> If not, I would like you to create a database, preferrably with the name "sales".<br> The database name is really not so important, but if you need to create a database anyway (in order to practice<br> this note), you may as well call it "sales". Furthermore, if possible, place the database files in "c:\oradata\sales".<br> Here too, the exact location of the files is not really important. If you have placed the database files elsewhere,<br> just remember to "translate" my location (which is "c:\oradata\sales") to whatever you use.<br> <br> But, in case you still need to create an Instance/database, just start the "Database Configuration Wizard",<br> which you can find in your Windows Start menu, or just open a cmd prompt and enter "dbca".<br> using the "Database Configuration Wizard", create a database, and you can use pretty much use the defaults<br> as suggested by the Wizard.<br> Again, (if possible) except for the disklocation of the database files, and, (if possible) choose a database blocksize of 8192 bytes.<br> The blocksize, in principle, could be 4K, 8K, 16K, 64K etc.., but depending on the details of your storage, the Wizard will<br> not always allow you to pick another blocksize, other as what it will come up with.<br> <br> Once the database is created, we will create a couple of Tablespaces and a few objects on those tablespaces.<br> <br> <I>Note: if you are familiar with SQL Server: An Oracle Tablespace resembles the function of a SQL Server "Filegroup".</I><br> <br> There are many ways to create tablespaces and objects like tables. In 10g/11g, a Webbased graphical "Enterprise manager"<br> gives you almost complete control over the Instance and database. Here, we will just use a 'prompt-oriented" sqlplus utility:<br> <br> - Start a cmd prompt<br> - type "set ORACLE_SID=SALES" (without the ""). Ofcourse, if your database is named otherwise, then use that name.<br> - Enter: sqlplus /nolog<br> - Hopefully, an "SQL>" prompt appears. If not, the utility is probably not in your "environment" or "path".<br> - Now, we need to "connect" (or logon) to the Instance. Try this:<br> <br> <font face="courier" size=2 color="black"> SQL> connect sys/password@sales as sysdba <I>replace "password" with the true password of "sys"</I><br> Connected.<br> SQL><br> <br> <font face="courier" size=2 color="blue"> Note: If the above logon did not worked, try:<br> SQL> connect / as sysdba<br> <br> <font face="arial" size=2 color="black"> Now, just type the following commands (or use copy/paste):<br> <br> <font face="courier" size=2 color="black"> CREATE TABLESPACE SALES_DATA_1 DATAFILE 'C:\ORADATA\SALES\SALES_DATA_01.DBF' SIZE 50M<br> EXTENT MANAGEMENT LOCAL AUTOALLOCATE<br> SEGMENT SPACE MANAGEMENT AUTO;<br> <br> CREATE TABLESPACE SALES_DATA_2 DATAFILE 'C:\ORADATA\SALES\SALES_DATA_02.DBF' SIZE 50M<br> EXTENT MANAGEMENT LOCAL AUTOALLOCATE<br> SEGMENT SPACE MANAGEMENT AUTO;<br> <br> CREATE TABLESPACE SALES_INDEX_1 DATAFILE 'C:\ORADATA\SALES\SALES_INDEX_01.DBF' SIZE 50M<br> EXTENT MANAGEMENT LOCAL AUTOALLOCATE<br> SEGMENT SPACE MANAGEMENT AUTO;<br> <br> CREATE USER TESTUSER IDENTIFIED BY TESTUSER<br> DEFAULT TABLESPACE SALES_DATA_1<br> TEMPORARY TABLESPACE TEMP;<br> <br> GRANT CONNECT TO TESTUSER;<br> GRANT DBA TO TESTUSER; <br> <br> /* we will not bother about a surplus of "rights" to the "testuser" in this test database */<br> /* Now, lets see of our testuser can connect to the database: */<br> <br> SQL> connect testuser/testuser@sales<br> Connected.<br> <br> SQL> select name from v$database;<br> NAME<br> ---------<br> SALES<br> <br> <font face="arial" size=2 color="black"> So, we have a couple of tablespaces and a user which can connect, and can who create objects like tables and indexes.<br> <br> Let's connect as the testuser again, and create a few tables. Note that the EMPLOYEE1 and EMPLOYEE2 tables,<br> are created on different tablespaces.<br> <br> <font face="courier" size=2 color="black"> CREATE TABLE EMPLOYEE1<br> (<br> EMP_ID NUMBER(6) NOT NULL,<br> EMP_NAME VARCHAR2(20) NOT NULL,<br> SALARY NUMBER(7,2)<br> )<br> TABLESPACE SALES_DATA_1;<br> <br> CREATE TABLE EMPLOYEE2<br> (<br> EMP_ID NUMBER(6) NOT NULL,<br> EMP_NAME VARCHAR2(20) NOT NULL,<br> SALARY NUMBER(7,2)<br> )<br> TABLESPACE SALES_DATA_2;<br> <br> <br> /* Now, our testuser will insert 9999 records into the EMPLOYEE1 table:<br> <br> declare<br> i number := 1;<br> begin<br> while i<10000 loop<br> insert into EMPLOYEE1<br> values (i,'harry',2500);<br> <br> i := i + 1;<br> end loop;<br> commit;<br> end;<br> /<br> <br> /* Next, check if we have those records in the EMPLOYEE1 table:<br> <br> set linesize 1000<br> set pagesize 1000<br> <br> SELECT * FROM EMPLOYEE1<br> <br> /* This should return 9999 records. */<br> <br> <font face="arial" size=2 color="black"> We are almost ready with setting up our SALES database. The following statement will very likely NOT WORK<br> because your database has a "locally managed" system tablespace.<br> You might try the next statement, but on 10g/11g it will almost certainly not work, because per default the SYSTEM tablespace is locally managed.<br> If you get <I>"ORA-12913: Cannot create dictionary managed tablespace"</I>, there is nothing to worry about.<br> Only in a few cases, I will refer to a dictionary managed tablespace.<br> Anyway, I will create a 'third' data tablespace, which is "dictionary managed".<br> Later, we will explain the difference between a locally and dictionary managed tablespace.<br> <br> <font face="courier" size=2 color="black"> CREATE TABLESPACE SALES_DATA_3 DATAFILE 'C:\ORADATA\SALES\SALES_DATA_03.DBF' SIZE 50M<br> EXTENT MANAGEMENT DICTIONARY<br> default storage (<br> initial 512K<br> next 512K<br> minextents 1<br> pctincrease 0 )<br> minimum extent 512K<br> logging<br> online<br> permanent;<br> <br> /* The upper statement very likely fails in your situation. Don't worry about it. */<br> <br> <br> <font face="arial" size=2 color="black"> <br> <h3>1.2 Getting a Hex editor for Windows:</h3> Next, to emulate several crashes, we need a <I>hex editor</I>, by which we can open a database file,<br> and "corrupt" the file manually.<br> There are several choices here. You might consider the free utility XVI32, which you can download from<br> the site below.<br> <br> http://www.chmaas.handshake.de/delphi/freeware/xvi32/xvi32.htm<br> <br> You only have to download the zip file, and extract it. Then you are ready to run it.<br> There exists many alternatives too. Just "google" on "hex editor".<br> But I really recommend xvi32, due to its simplicity and capabilities.<br> <br> <br> <h2>Chapter 2. Examples of soft and hard block corruption.</h2> <h3>2.1. Soft block corruption.</h3> Our "testuser" inserted 9999 records into the EMPLOYEE1 table. This table is located in the "SALES_DATA_1" tablespace,<br> which corresponds to the file "c:\oradata\sales\sales_data_01.dbf".<br> Now, say that the first couple of blocks (each 8192 bytes) are reserved for system purpose, then it's fair to "gamble" that blockno 20<br> is allocated to the EMPLOYEE table. Since 20x8192=163840 in decimal, that would be 28000 in Hexidecimal.<br> <br> Note: just use Windows calculator, switch to "scientific mode", type 8x8192. When the result has appeared, click the "Hex" radio control.<br> <br> Let's "edit" that file, while the database is closed. Startup "sqlplus", logon as "sys" (as sysdba) and shutdown the instance:<br> <br> <font face="courier" size=2 color="black"> SQL> shutdown immediate<br> <br> <font face="arial" size=2 color="black"> After a short while, the database is closed, meaning that no Oracle processes is accessing any database file,<br> and all processes are terminated (the database files are "cold").<br> <br> - Startup xvi32.<br> - Open the file "c:\oradata\sales\sales_data_01.dbf".<br> - Navigate (scroll) to hex address 28000.<br> <br> I noticed (in my case) that on the 19th block (just "before" 28000), there is a "harry" listed.<br> Actually, this is the "last" harry in that block. Now, I navigated to that string,<br> and replaced "harry" with "HENRY".<br> This is an example of a "soft" corruption. Physically, the block is sound, but the checksum of the block has changed.<br> <br> <img src="oraclecorr1.jpg" align="center"/> <br> Save the file, and close xvi32.<br> <br> Let's start the instance, and open the SALES database. Start sqlplus (note that you probably should set the "ORACLE_SID=sales" variable first).<br> and startup the instance:<br> <br> <font face="courier" size=2 color="black"> C:\> set ORACLE_SID=sales<br> C:\> sqlplus /nolog<br> SQL> connect / as sysdba<br> connected<br> SQL> startup<br> ..<br> SQL> connect testuser/testuser<br> connected<br> SQL> SELECT * FROM EMPLOYEE1;<br> <br> ERROR:<br> ORA-01578: ORACLE data block corrupted (file # 5, block # 19)<br> ORA-01110: data file 5: 'C:\ORADATA\SALES\SALES_DATA_01.DBF'<br> <br> <font face="arial" size=2 color="black"> Ok, the block number is just what we expected it to be. Oracle discovered a checksum mismatch on that block.<br> Actually, the block is just physically sound, but the "administration" of that block is not right.<br> <br> Dit you note that Oracle didn't complained at all at startup, about the corrupt block?<br> This is so because it's just <B>an ordinary datablock</B>. We can't expect Oracle to check every block at boottime.<br> Only when a process tries to access the block, the error is generated.<br> Note: Ofcourse many "special" blocks <I>are</I> checked, such as the headers of the database files.<br> <br> <h3>2.2. Soft corruption of a Database block, and the DB_BLOCK_CHECKING init.ora parameter.</h3> If it get's clear that a table has a corrupt block, you might think of a specific init.ora parameter: DB_BLOCK_CHECKING.<br> As we will soon find out, it will not help us SELECTING a table with a soft corrupt block.<br> Anyway, let's test it.<br> <br> The parameter can take the values:<br> <br> DB_BLOCK_CHECKING = { FALSE | OFF | LOW | MEDIUM | TRUE | FULL }<br> <br> As you can see, you can put the parameter to FALSE. Then No block checking is performed for blocks in user tablespaces.<br> Although it is recommended to place it to "TRUE" for production systems, let's see what happens if we place it explicitly<br> to "FALSE" for the SALES database.<br> I assume that you know that the spfile.ora (or init.ora) contains a number of "settings", that the instance will read<br> at startup. You can edit the "init.ora" file accordingly, but most 10g, 11g systems use the binary "spfile.ora" variant,<br> which you can alter using "ALTER SYSTEM SET <configparameter> = VALUE SCOPE=BOTH|MEMORY|SPFILE"<br> <br> So here we go:<br> <br> <font face="courier" size=2 color="black"> SQL> alter system set DB_BLOCK_CHECKING=FALSE SCOPE=BOTH;<br> <br> System altered<br> <br> SQL> shutdown immediate /* lets do a restart of the instance anyway */ <br> ..<br> SQL> startup<br> ..<br> SQL> show parameter DB_BLOCK_CHECKING; /* lets check the current value */ <br> <br> NAME TYPE VALUE<br> ------------------------------------ ----------- --------------------<br> db_block_checking string FALSE<br> SQL><br> ..<br> SQL> SELECT * FROM TESTUSER.EMPLOYEE1;<br> <br> ERROR:<br> ORA-01578: ORACLE data block corrupted (file # 5, block # 19)<br> ORA-01110: data file 5: 'C:\ORADATA\SALES\SALES_DATA_01.DBF'<br> <br> <font face="arial" size=2 color="black"> So, placing this parameter to FALSE, does not help in selecting all rows from a Table, if a block is logically corrupt.<br> Actually DB_BLOCK_CHECKING main purpose is checking that the block in memory will be checked after it is written to disk,<br> which is a good thing.<br> But it's not an "easy escape" to be able to fully select a table while a block is corrupt.<br> <br> <br> <h3>2.3 Analyzing Tables and Files for corruption.</h3> Before we repair anything, let's explore a few "well known" Oracle sanity checks.<br> <br> <B>1. The "ANALYZE TABLE_NAME VALIDATE STRUCTURE" statement.</B><br> <br> Connect as sys (sysdba) and try the analyze command on the TESTUSER.EMPLOYEE1 table.<br> <br> <font face="courier" size=2 color="black"> SQL> connect / as sysdba<br> Connected.<br> <br> SQL> ANALYZE TABLE TESTUSER.EMPLOYEE1 VALIDATE STRUCTURE;<br> analyze table TESTUSER.EMPLOYEE1 VALIDATE STRUCTURE<br> *<br> ERROR at line 1:<br> ORA-01578: ORACLE data block corrupted (file # 5, block # 19)<br> ORA-01110: data file 5: 'C:\ORADATA\SALES\SALES_DATA_01.DBF'<br> <br> <font face="arial" size=2 color="black"> Here, the "analyze" command comes up with exactly the same errormessage. You see? It could be a very usable tool<br> to check your tables once in a while.<br> <br> <B>2. The "dbv" OS prompt utility .</B><br> <br> The ANALYZE TABLE statement you use while you are "in" Oracle, that is, you have a sqlplus session to the Database,<br> and you use that statement just as you would use any other SQL statement.<br> <br> A check from the Operating System is possible too. For this, Oracle has the "dbv" utility available.<br> It's very easy to use. Take a look at the following example:<br> <br> <font face="courier" size=2 color="black"> C:\oradata\sales>dbv file=SALES_DATA_01.DBF blocksize=8192<br> <br> DBVERIFY - Verification starting : FILE = SALES_DATA_01.DBF<br> Page 19 is marked corrupt<br> Corrupt block relative dba: 0x01400013 (file 5, block 19)<br> Bad check value found during dbv:<br> Data in bad block:<br> type: 6 format: 2 rdba: 0x01400013<br> last change scn: 0x0000.000c9749 seq: 0x1 flg: 0x06<br> spare1: 0x0 spare2: 0x0 spare3: 0x0<br> consistency value in tail: 0x97490601<br> check value in block header: 0xe3b6<br> computed block checksum: 0x43c<br> <br> DBVERIFY - Verification complete<br> <br> Total Pages Examined : 6400<br> Total Pages Processed (Data) : 27<br> Total Pages Failing (Data) : 0<br> Total Pages Processed (Index): 0<br> Total Pages Failing (Index): 0<br> Total Pages Processed (Other): 12<br> Total Pages Processed (Seg) : 0<br> Total Pages Failing (Seg) : 0<br> Total Pages Empty : 6360<br> Total Pages Marked Corrupt : 1<br> Total Pages Influx : 0<br> Highest block SCN : 825161 (0.825161)<br> <br> <font face="arial" size=2 color="black"> <B>3. Some v$ views</B><br> <br> The above commands (ANALYZE TABLE and the dbv utility) are tools by which you can check objects <I>actively</I>. You issue the Statement or command, and the check is done right now.<br> <br> Some "v$" views will register corruption as well, like for example "V$DATABASE_BLOCK_CORRUPTION".<br> The only problem often seen here, is that those views will only receive input <I>if some Oracle component</I><br> have found the corruption, and not by you (or a userprocess) issueing some statement.<br> For example, if you would make RMAN backups, and RMAN encounters a corrupt block, then it will be "registered",<br> and from then on only, it will be visible in some of those v$ views.<br> <br> <br> In this section, we have seen how to soft corrupt a database block, and we have seen the infamouse Oracle error<br> message "ORA-01578: ORACLE data block corrupted (file # n, block # m)"<br> Also we have seen two Oracle methods that enables us to check tables and files on corruption.<br> <br> <h3>2.4 Hard corruption of a Database block.</h3> This time, we will truly destroy a few blocks, that is, fill a region with "all zero's".<br> We will start somewhere in the middle of a block, then cross a block boundary, and continue<br> to write zero's all up to somewhere in the middle of that next block.<br> <br> First, we will start with a "fresh" situation. That is, we will drop the testuser (and all of his objects),<br> and then we will drop the tablespaces "SALES_DATA_1" and "SALES_DATA_1".<br> Then we will create those tablespaces again, create the testuser again, create the tables again, and we will<br> use the script that inserts 9999 into the EMPLOYEE1 table.<br> It's not much work, since we have all the neccessary statements in Chapter 1.<br> <br> Now you might argue, that dropping the user with "DROP USER TESTUSER CASCADE" would be enough.<br> Indeed, DROP USER... CASCADE should drop the user, with all it's objects included.<br> Well, not exactly. This will be explained later.<br> So, please do the following:<br> <br> - from the "C:\>" prompt, set the ORACLE_SID=sales (or whatever name your database uses)<br> - Start sqlplus /nolog<br> - connect / as sysdba (or use "connect sys/password@sales as sysdba")<br> then use statements as show below:<br> <br> <font face="courier" size=2 color="black"> SQL> connect / as sysdba<br> Connected.<br> SQL> DROP USER TESTUSER CASCADE;<br> <br> User dropped.<br> <br> SQL> DROP TABLESPACE SALES_DATA_1 INCLUDING CONTENTS;<br> <br> Tablespace dropped.<br> <br> SQL> DROP TABLESPACE SALES_DATA_2 INCLUDING CONTENTS;<br> <br> Tablespace dropped.<br> <br> Next, delete the physical files from the system. For that, use explorer or use the cmd prompt:<br> <br> C:\oradata\sales> del sales_data_* <br> <br> <font face="arial" size=2 color="black"> Now, using the statements of chapter 1:<br> - create those tablespaces (SALES_DATA_1 and SALES_DATA_2) again,<br> - create the testuser again,<br> - create the tables again (as user TESTUSER), and insert 9999 records into the EMPLOYEE1 again.<br> <br> This is a fresh start !<br> <br> If you executed the above, again it's (oviously) "fair" to "assume" that blockno 20<br> is allocated to the EMPLOYEE1 table.<br> Again, since 20x8192=163840 in decimal, that would be 28000 in Hexidecimal.<br> <br> Now, let's shutdown the database, start xvi32, and open the "C:\ORADATA\SALES\SALES_DATA_01.DBF" databasefile.<br> <br> Scroll down to hex address 27EEE (the 19th block), start entering "zero's", all the way to for example<br> hex address 28176. Here, we have destroyed the end of the 19th block, crossing the boundary,and destroyed<br> the header of the 20th block. This is hard corruption.<br> <br> Save the file, and close xvi32.<br> <br> Let's start the instance, and open the SALES database. Start sqlplus and startup the instance:<br> <br> <font face="courier" size=2 color="black"> C:\> set ORACLE_SID=sales<br> C:\> sqlplus /nolog<br> SQL> connect / as sysdba<br> connected<br> SQL> startup<br> ..<br> SQL> connect testuser/testuser<br> connected<br> SQL> SELECT * FROM EMPLOYEE1;<br> <br> ERROR:<br> ORA-01578: ORACLE data block corrupted (file # 5, block # 19)<br> ORA-01110: data file 5: 'C:\ORADATA\SALES\SALES_DATA_01.DBF'<br> <br> <font face="arial" size=2 color="black"> Just as before, at the moment Oracle accesses the first corrupt block, it pulls out, throwing the same<br> familiar message as in section 2.1.<br> Note that from the message alone, you cannot determine whether a block is soft- or hardcorrupted.<br> <br> Ofcourse, we can use the "ANALYZE TABLE" SQL Statement again, and we can try the "dbv" utility.<br> Let's try the latter one. Exit sqlplus and open a command cmd window.<br> Navigate to the C:\ORADATA\SALES directory. Enter the familiar dbv command:<br> <br> <font face="courier" size=2 color="black"> C:\oradata\sales> dbv file=SALES_DATA_01.DBF blocksize=8192<br> <br> DBVERIFY - Verification starting : FILE = SALES_DATA_01.DBF<br> <B>Page 19 is influx - most likely media corrupt</B><br> Corrupt block relative dba: 0x01400013 (file 5, block 19)<br> Fractured block found during dbv:<br> Data in bad block:<br> type: 6 format: 2 rdba: 0x01400013<br> last change scn: 0x0000.000cb030 seq: 0x1 flg: 0x06<br> spare1: 0x0 spare2: 0x0 spare3: 0x0<br> consistency value in tail: 0x00000000<br> check value in block header: 0xefa5<br> computed block checksum: 0x9678<br> <br> <B>Page 20 is influx - most likely media corrupt</B><br> Corrupt block relative dba: 0x01400014 (file 5, block 20)<br> Fractured block found during dbv:<br> Data in bad block:<br> type: 0 format: 0 rdba: 0x00000000<br> last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00<br> spare1: 0x0 spare2: 0x0 spare3: 0x0<br> consistency value in tail: 0xb0300601<br> check value in block header: 0x0<br> block checksum disabled<br> <br> DBVERIFY - Verification complete<br> <br> Total Pages Examined : 6400<br> .. (some records omitted)<br> Total Pages Marked Corrupt : 2<br> Total Pages Influx : 2<br> Highest block SCN : 831536 (0.831536)<br> <br> <font face="arial" size=2 color="black"> <br> This time, the <B>dbv</B> utility informs you about the "hard" corruption. Note the "<B>Page 19 is influx - most likely media corrupt</B>" message.<br> A similar message is shown for block 20.<br> This is very different from the dbv output as shown in section 2.1 (the soft corrupted block).<br> <br> <br> <h2>Chapter 3. Repairing objects with corrupt blocks.</h2> <h3>3.1. Repairing tables with the DBMS_REPAIR.SKIP_CORRUPT_BLOCKS() procedure.</h2> This method, using DBMS_REPAIR.SKIP_CORRUPT_BLOCKS(), can be regarded as "the quick fix".<br> <br> The DBMS_REPAIR package, contains a number of subprograms that enable you to <B>detect</B>, or <B>skip</B>, or <B>repair</B> corrupt blocks<br> in tables and indexes.<br> At many occasions, you probably use the "skip" functionality, after you have seen that repair is not the "repair"<br> you probably hoped it to be.<br> <br> Why whould a repair not work? <br> <br> - In section 2.1, we have seen an example of a "soft" corruption. Here, we replaced a characterstring (harry)<br> with another characterstring (HENRY) in the same place in that block. It only lead to a different<br> checksum of that block. Such soft corruptions are indeed (in principle) repairable, and thus the block(s) might be recovered.<br> <br> - In section 2.4, I hard corrupted a few blocks, by placing all zero's in the footer of one block, crossing over<br> to the next block, and destroying that block's header and most of it's data.<br> That is not repairable, because the information is truly lost.<br> Only an RMAN block restore could help us here, or possibly one of Oracle's fabulous FLASHBACK options.<br> <br> We will study two procedures from the package:<br> <br> (1): DBMS_REPAIR.SKIP_CORRUPT_BLOCKS() <br> (2): DBMS_REPAIR.FIX_CORRUPT_BLOCKS() <br> <br> Personally, I favour the "SKIP" procedure over the "FIX" procedure.<br> In case of "hard" corruption or "soft" corruption, where one block, or a few blocks, are affected, you can ensure<br> that Oracle <B>just skips them (!)</B>, so that the table can be properly accessed, and that will allow you to copy the data<br> over to a new copy of that table.<br> In "real life" circumstances, actually this is a very nice solution, that <B>indeed</B> will work.<br> <br> Remember, as you have seen in Chapter 2, if you query a table with a corrupt block, <B>it get's interrupted</B> when the<br> corrupt block is accessed. So, it's difficult to make a full copy of the data of all "good" blocks.<br> <br> If only one or two blocks are inaccessible, usually this represents just a tiny bit of data, compared to <B>all</B> data<br> that's stored in that table.<br> That's why the "skip corrupted" procedure is so valuable, in case of corrupted blocks. This one works <I>really</I> works "quickly".<br> <br> Let's try it.<br> <br> This procedure enables or disables the skipping of corrupt blocks during index and table scans of the specified object.<br> In most occasions, you want Oracle "to skip" corrupt blocks, so that queries can proceed without interruptions.<br> In this section, we are going to "enable" the skipping of bad blocks. <br> Thus, what we are going to do here, is telling Oracle to skip over corrupt block, which will enable you to create<br> a "good" table, reading all good blocks, while ONLY skipping the bad block(s).<br> <br> The full syntax of the function is shown below. It takes 4 input vars, of which the last 2 can be left as the default.<br> <br> exec dbms_repair.skip_corrupt_blocks(SCHEMA_NAME, OBJECT_NAME, OBJECT_TYPE, SKIP_FLAG);<br> <br> Here, the SCHEMA_NAME can be read as the "owner" of the object (like testuser, harry etc..)<br> Formally, a SCHEMA is not equal to what is understood to be an "owner", but it usually is.<br> The OBJECT_NAME, most often is the name of a table, or an index.<br> The OBJECT_TYPE, most often is "table" or "index". The default is "table".<br> The SKIP_FLAG tells Oracle to skip or not to skip, corrupt tables. The default is "skip".<br> <br> Remember what happened in section 2.2, where we have the situation that two blocks is the TESTUSER.EMPLOYEE1 table, are hard corrupted?<br> When one would query that table, this happened:<br> <br> <font face="courier" size=2 color="black"> C:\> set ORACLE_SID=sales<br> C:\> sqlplus /nolog<br> SQL> connect / as sysdba<br> connected<br> SQL> startup<br> ..<br> SQL> SELECT * FROM TESTUSER.EMPLOYEE1;<br> <br> ERROR:<br> ORA-01578: ORACLE data block corrupted (file # 5, block # 19)<br> ORA-01110: data file 5: 'C:\ORADATA\SALES\SALES_DATA_01.DBF'<br> <br> <font face="arial" size=2 color="black"> Let's "fix" the table, using "dbms_repair.skip_corrupt_blocks":<br> <br> <font face="courier" size=2 color="black"> SQL> connect / as sysdba<br> connected<br> <br> SQL> exec dbms_repair.skip_corrupt_blocks('TESTUSER', 'EMPLOYEE1');<br> <br> PL/SQL procedure successfully completed.<br> <br> SQL> SELECT * FROM TESTUSER.EMPLOYEE1;<br> <br> <I>All rows from all good blocks are shown.</I><br> <br> <font face="arial" size=2 color="black"> Don't forget the quotes around the SCHEMA_NAME and OBJECT_NAME (thus: 'TESTUSER', 'EMPLOYEE1') <br> The above action has made the EMPLOYEE1 table accesible again !<br> <br> Ofcourse, it is suggested that you now create a copy table with all good data (which usually will be "the bulk" of the data).<br> So, in general, that would be using a Statement like:<br> <br> <font face="courier" size=2 color="black"> CREATE TABLE salvage_table<br> AS SELECT * FROM corrupt_table;<br> <br> <font face="arial" size=2 color="black"> Now you have saved the "bulk" of the tablerows, because usually there is only one (or at most a couple) bad blocks.<br> So, making the table "alive" again, and copying the good rows to another table, is quite a reasonable solution.<br> Ofcourse, you will miss a couple of rows (those that resided on the bad blocks).<br> <br> <B>CONCLUSION: <br> <br> DBMS_REPAIR.SKIP_CORRUPT_BLOCKS(SCHEMA_NAME,TABLE_NAME)<br> <br> is a very reasonable method to access a table with corrupt blocks, and copy the content to a new table.</B><br> <br> <li>But remember, this procedure is <B>best to be used</B> if you want to copy all "good" rows from the "bad" table, to a <B>new</B> table.</li><br> <li>If you want <B>to keep using the current table</B>, it's better to use the method of section 3.2.</li><br> <br> <I>Note: If Oracle is skipping corrupt block(s), you are also able to export (using 'exp' or 'expdp') the table to a file.</I><br> <br> <br> <h3> 3.2 The DBMS_REPAIR.FIX_CORRUPT_BLOCKS() procedure:</h3> If you are already content with the method of section 3.1, because it's a method to save the good rows of a table,<br> you may skip this section, and go on to section 3.3.<br> <br> In case of block corruption, we might consider using the DBMS_REPAIR.FIX_CORRUPT_BLOCK() procedure. Note the expression "FIX" in<br> the procedure name.<br> I am afraid that it will be a bit of a dissapointment, because you might have thought that it could repair soft corruption.<br> In many circumstances, the result is that the procedure will "mark" or "flag" soft or hard corrupted block(s),<br> and then consider it to be "repaired".<br> <br> In case of a hard corrupted block (as in section 2.2), data is truly gone (like we did in writing "all zero's").<br> So, in all hostety, we can't expect Oracle to repair that (that is, get the rows back that were on that block).<br> <br> In case of a soft corrupted block (as in section 2.1), in some circumstances, it would be <B>reasonable</B> to expect, that the block<br> can be repaired. For example, if only a checksum on a block is not right, it should be repairable.<br> Using the Oracle DBMS_REPAIR procedures allone, you won't <I>really</I> repair the block.<br> But, in conjunction with the DUMP_ORPHAN_KEYS() and ADMIN_TABLES() procedures, you might get usefull info<br> about indexes that are affected, and freelists which should be rebuild. This might help in the overall process.<br> The idea then is, that you want to <I>continue</I> using that object, and thus additional procedures can be used<br> which helps you in realizing that (providing you info about indexes, freelist etc..)<br> <br> But again, with respect to purely salvaging table data, you won't be better off compared to the method of section 3.1.<br> <br> The basic idea is this:<br> <br> The process requires two administrative tables to hold a list of <I>corrupt block(s)</I> and <I>index keys</I> pointing<br> to those blocks. These are created as follows, using the DBMS_REPAIR.ADMIN_TABLES() sub procedure :<br> <br> <font face="courier" size=2 color="black"> BEGIN<br> <B>Dbms_Repair.Admin_Tables</B> (<br> table_name => 'REPAIR_TABLE',<br> table_type => Dbms_Repair.Repair_Table,<br> action => Dbms_Repair.Create_Action,<br> tablespace => 'SALES_DATA_1'); <br> <br> <B>Dbms_Repair.Admin_Tables</B> (<br> table_name => 'ORPHAN_KEY_TABLE',<br> table_type => Dbms_Repair.Orphan_Table,<br> action => Dbms_Repair.Create_Action,<br> tablespace => 'SALES_DATA_1');<br> END;<br> /<br> <br> <font face="arial" size=2 color="black"> The above statements only builds the admin tables.<br> <br> Now that the administration tables are built, we are able to check the table using the <B>DBMS_REPAIR.CHECK_OBJECT()</B> procedure.<br> This procedure behaves quite like the "ANALYZE TABLE" command, to parse through the whole table, checking each block.<br> <br> <font face="courier" size=2 color="black"> SET SERVEROUTPUT ON /* needed for more output on your screen */ <br> <br> DECLARE <br> no_corrupt INT;<br> BEGIN<br> no_corrupt := 0;<br> <B>Dbms_Repair.Check_Object</B> (<br> schema_name => 'TESTUSER',<br> object_name => 'EMPLOYEE1',<br> repair_table_name => 'REPAIR_TABLE',<br> corrupt_count => no_corrupt);<br> Dbms_Output.Put_Line('number blocks corrupt: ' || TO_CHAR (no_corrupt));<br> END;<br> <br> <font face="arial" size=2 color="black"> <br> After running the above statement, the currupt block(s) have been detected, but are not marked as corrupt yet.<br> Next, the <B>DBMS_REPAIR.FIX_CORRUPT_BLOCKS()</B> procedure can be used to mark the blocks as corrupt, allowing them<br> to be skipped when queries are run.<br> <br> <font face="courier" size=2 color="black"> SET SERVEROUTPUT ON<br> DECLARE<br> v_fixed INT;<br> BEGIN <br> v_fixed := 0;<br> <B>Dbms_Repair.Fix_Corrupt_Blocks</B> (<br> schema_name => 'TESTUSER',<br> object_name=> 'EMPLOYEE1',<br> object_type => Dbms_Repair.Table_Object,<br> repair_table_name => 'REPAIR_TABLE',<br> fix_count=> v_fixed);<br> Dbms_Output.Put_Line('number of blocks fixed: ' || to_char(v_fixed));<br> END;<br> /<br> <br> <font face="arial" size=2 color="black"> At this phase, corrupt table blocks have been located, and all are "marked". Now, if you would know that<br> the table has indexes, you can check all index entries, searching for "orphaned keys", meaning <I>those</I> index entries<br> that were pointing to the lost rows on the bad table blocks.<br> So, this can be done using the <B>DBMS_REPAIR.DUMP_ORPHAN_KEYS()</B> procedure.<br> Now, suppose the EMPLOYEE1 table has an index called "INDEX_EMP_NAME", you could run a statement as follows:<br> <br> <font face="courier" size=2 color="black"> SET SERVEROUTPUT ON<br> DECLARE<br> v_number_orphans INT;<br> BEGIN<br> v_number_orphans := 0;<br> <B>Dbms_Repair.Dump_Orphan_Keys</B> (<br> schema_name => 'TESTUSER',<br> object_name => 'INDEX_EMP_NAME',<br> object_type => Dbms_Repair.Index_Object,<br> repair_table_name => 'REPAIR_TABLE',<br> orphan_table_name=> 'ORPHAN_KEY_TABLE',<br> key_count => v_number_orphans);<br> Dbms_Output.Put_Line('orphan key count: ' || to_char(v_number_orphans));<br> END;<br> /<br> <br> <font face="arial" size=2 color="black"> Ofcourse, since we have the Admin tables "REPAIR_TABLE" and "ORPHAN_KEY_TABLE", it is very instructive<br> to view their contents.<br> <h3>Excercise:</h3> Now, here is an exercise: you might try the above described procedure yourself !<br> <br> Again, you should start with a "fresh" situtation. Then you will create (again) a soft corruption in the 19th block<br> in the EMPLOYEE1 table, exactly as is described in section 2.1.<br> <br> <font face="courier" size=2 color="black"> SQL> connect / as sysdba<br> Connected.<br> SQL> DROP USER TESTUSER CASCADE;<br> <br> User dropped.<br> <br> SQL> DROP TABLESPACE SALES_DATA_1 INCLUDING CONTENTS;<br> <br> Tablespace dropped.<br> <br> SQL> DROP TABLESPACE SALES_DATA_2 INCLUDING CONTENTS;<br> <br> Tablespace dropped.<br> <br> Next, delete the physical files from the system. For that, use explorer or use the cmd prompt:<br> <br> C:\oradata\sales> del sales_data_* <br> <br> <font face="arial" size=2 color="black"> Now, using the statements of chapter 1:<br> - create those tablespaces (SALES_DATA_1 and SALES_DATA_2) again,<br> - create the testuser again,<br> - create the tables again (as user TESTUSER), and insert 9999 records into the EMPLOYEE1 again.<br> - Also create an index on the EMPLOYEE1 table called "INDEX_EMP_NAME", using:<br> CREATE INDEX INDEX_EMP_NAME ON EMPLOYEE1(EMP_NAME) TABLESPACE SALES_INDEX_1.<br> <br> Then, connect as sys (sysdba) and shutdown the database. Now, as described in section 2.1, start xvi32,<br> open the "c:\oradata\sales\sales_data_01.dbf" file, navigate to the 19th block (just "before" hex 28000),<br> and change one of the "last" harry's into HENRY. Then save the file and close xvi32.<br> Next, startup the database.<br> <br> Now, if you want, execute the various DBMS_REPAIR subprocedures, as described above, yourself <br> <br> <br> <h3>3.3 The ORA-01578 and ORA-08103 errors.</h2> <B><U>ORA-01578:</B></U><br> <br> Block corruption really will not occur "often". But when it <I>does</I> happen, then you will see<br> the well-known ORA-01578 (or ORA-1578) error message.<br> In all the examples above, we indeed have seen this particular message.<br> <br> <B><U>ORA-08103:</B></U><br> <br> Usually, the message is not so shocking, because it merely means this: <I>ORA-08103: object no longer exists.</I><br> <br> In general, we can distinguish three cases:<br> <br> (1):<br> An object was indeed missing. Such an event could for example happen if some process tries to access the object,<br> while another process for example dropped or moved it.<br> It does not have to be interactive processes. It could be batch processes where one process want to perform<br> a drop and re-create (of a table, or index etc..), while another process is not 'aware' of that fact.<br> So, in this case, the message is just stating reality: the object does not exist (for some process, at some time),<br> and thus is not alarming (except for the fact that the programming logic is not right).<br> <br> (2):<br> Over several Oracle versions, a few bugs raised this message too, effectively generating a "false" alarm.<br> For example:<br> - PQ may signal a false ORA-8103 message under a high load.<br> - ORA-8103/ORA-1410 from concurrent INSERT / export on ASSM tables<br> <br> (3):<br> Not withstanding the above facts, we should always take this message very seriously.<br> If you see it (for example, a user reports it, or you find it in a log), and you know for sure that the object exists,<br> then you might have run into a "bug". Or, indeed some form of logical corruption is bothering your system.<br> <br> If you see the message in production, and no obvious reason can be found (like that some batch indeed re-creates the object,<br> at the time that another process tries to access it), and you cannot find any filed bug report, I think it needs immediate attention.<br> <br> I consider this message more serious than the ordinary "block-corrupt" errormessage (which is ORA-01578).<br> That is, <I>if you cannot explain it.</I><br> For production systems, I think you should immediately contact Oracle Support.<br> <br> As for other options:<br> You could try to re-create the object. I have seen myself that a " SELECT * " sometimes still works, while other operations fail.<br> <br> <br> <h3>3.4 Using the SET EVENTs parameter.</h2> Again, we return to the usual corrupt block message, which is ORA-01578.<br> There is another "quick fix", which might be important in some cases.<br> <br> Oracle provides many diagnostic tools for debugging the RDBMS. One "traditional" methods is, that certain tracing options,<br> or a certain type of "behaviour" of the Instance, can be turned on using the SET EVENTS statement. <br> In many cases this will result that "diagnostic information" (focused on a certain event) gets dumped to trace files.<br> Or, some other "SET EVENTS" commands, will just change the behaviour of the RDBMS in a certain way.<br> <br> There are several ways to put an "event" active, for example:<br> <br> <ol> <li>Using the ALTER SESSION SET EVENTS 'eventsnumber' statement. This will effect your current session only.</li><br> <li>Using the ALTER SYSTEM SET EVENTS 'eventsnumber' statement. This is a change "Instance" wide .</li><br> <li>Editing the init.ora startup file (if you do not use a spfile.ora). This is a change "Instance" wide .</li><br> </ol> <br> Now, the "behaviour" we are searching for, is that Oracle will skip "corrupt" blocks, and that it <br> allows you select the whole table again (except for the corrupt block).<br> <br> In a way, this method is not much different from what we have seen in section 3.1,<br> using the DBMS_REPAIR.SKIP_CORRUPT_BLOCKS procedure.<br> <br> If it works, you again have a method to SELECT all of the table again, and just skipping corrupt blocks.<br> This means that you can create a new table from the damaged one, just as was shown in section 3.1.<br> <br> Let's try it.<br> <br> First, let's start again with a "fresh" situation. That is, drop the testuser (and all of his objects),<br> and then drop the tablespaces "SALES_DATA_1" and "SALES_DATA_1".<br> Then create those tablespaces again, create the testuser again, create the tables again, and use the script<br> that inserts 9999 into the EMPLOYEE1 table.<br> It's not much work, since we have all the neccessary statements in Chapter 1.<br> Then use XVI32 to soft corrupt the 19th block (change a "harry" into "HENRY"), exactly as described in section 2.1<br> <br> If you have done all of the above, logon as "testuser", and try the following:<br> <br> <font face="courier" size=2 color="black"> SQL> SELECT * FROM EMPLOYEE1;<br> <br> (many records skipped...)<br> <br> ..<br> 2415 harry 2500<br> 2416 harry 2500<br> 2417 harry 2500<br> <br> ERROR:<br> ORA-01578: ORACLE data block corrupted (file # 5, block # 19)<br> ORA-01110: data file 5: 'C:\ORADATA\SALES\SALES_DATA_01.DBF'<br> <br> 2820 rows selected.<br> <br> <font face="arial" size=2 color="black"> Ofcourse, we face the dreaded ORA-01578 error again.<br> Next, try the following ALTER SESSION SET EVENTS command:<br> <br> <font face="courier" size=2 color="black"> SQL> ALTER SESSION SET EVENTS '10231 TRACE NAME CONTEXT FOREVER, LEVEL 10';<br> <br> SQL> SELECT * FROM EMPLOYEE1;<br> <br> (many records skipped...)<br> 9997 harry 2500<br> 9998 harry 2500<br> 9999 harry 2500<br> <br> 9596 rows selected.<br> <br> <font face="arial" size=2 color="black"> This time, <B>no error is produced</B>, and your query just continues and is not halted.<br> You are now able to select the whole table again, allowing you to salvage all 'good' rows to a new table.<br> <br> Note: If Oracle is one of your main focus points, and suppose you are not too familiar with the SET EVENTS statement,<br> you really should spend one of these rainy day's on digging further into this very usefull diagnostic tool.<br> <br> <br> <h3>3.5 Querying "around" bad blocks.</h2> In this method, you just use SQL queries to obtain all good rows from the object, and where you are only skipping<br> the bad block(s).<br> It's not garanteed that it will work in <I>all</I> cases, but in many it will.<br> The following excellent article can be viewed as a true "classic" on the subject, and it will explain you<br> all you need to know.<br> <br> <font face="arial" size=2 color="blue"> http://web.inter.nl.net/users/T.Koppelaars/ora01578.rtf<br> <font face="arial" size=2 color="black"> <br> The technique described in the article, rely on rowid's and some other sophisticated insights.<br> <br> I have nothing to add to content of the above article. Maybe I can spend just a few words on a special case.<br> This special case is illustrated by the following example. Here we try <I>to avoid to access the table at all,</I><br> and we want Oracle to ONLY use the <I>"covering index"</I>, in a very simple context.<br> <br> You know that the EMPLOYEE1 table has the following DDL:<br> <br> <font face="courier" size=2 color="black"> CREATE TABLE EMPLOYEE1<br> (<br> EMP_ID NUMBER(6) NOT NULL,<br> EMP_NAME VARCHAR2(20) NOT NULL,<br> SALARY NUMBER(7,2)<br> )<br> TABLESPACE SALES_DATA_1;<br> <br> <font face="arial" size=2 color="black"> Now, suppose we have an index on that table like this one:<br> <br> <font face="courier" size=2 color="black"> CREATE INDEX INDEX_EMP ON EMPLOYEE1(EMP_NAME, SALARY) TABLESPACE SALES_INDEX_1.<br> <br> <font face="arial" size=2 color="black"> In a sense, you may view an index as a sort of "mini" table of the real table object.<br> The chosen key(s) of the index, corresponds to the same columns of the table.<br> I agree that this example is a <I>very opportunistic</I> example, because the index here has almost all tablecolumns<br> except for the "EMP_ID" column.<br> But I only want to convey the basic idea here.<br> <br> Now, its just a fact, that if you <B>only</B> query on a selection of the table columns, which are in the index as well,<br> the chances are pretty good that Oracle will only access the index, and not accessing the table <B>at all</B>.<br> <br> So, that means that you can retrieve all data from the table, which is covered by the index.<br> This includes that data (from the index), which corresponds to the data (from the table) that is on the corrupt block(s).<br> Again, and it's very important to stress that fact, it's only that data (only those columns) which<br> is included in the index.<br> <br> The theory is somewhat illustrated by the figure below:<br> <br> <img src="oraclecorr3.jpg" align="center"/> <br> And you know? You can even ask Oracle "nicely" to <B>use</B> the index, by using a <B>"Query Hint".</B><br> But even without the "hint", if you only query on colums which are all covered by the index, Oracle will<br> almost for sure use that index without accessing the table at all.<br> <br> I agree that the example above certainly has it's flaws. But I hope the general idea is clear.<br> <br> Excercise: try to make the above scenario work, and see if you indeed are able to retreive <U>all</U> EMP_NAME and SALARY<br> values, using the index alone.<br> <br> <br> <h3>3.6 Some remarks on the structure of a block.</h2> You probably know that "segments", like tables and indexes, are build from database "blocks" (or "pages').<br> Usually, a DBA selects a suitable blocksize voor the entire database, or for individual tablespaces.<br> Typical blocksizes are 4K, 8K, 32K etc..<br> <br> Oracle's block structure is quite complex. In such block, there are many specific "fields",<br> each with a very specific purpose to the database engine.<br> Below is a very schematic, and very simplified, representation of a Data Block:<br> <br> <br> <img src="oraclecorr5.jpg" align="center"/> <br> <br> We can see a "Header" part, with a number of fixed fields, like the "DBA" (database block address), or the "blocktype".<br> The blocktype identifies what sort of block we are dealing with.<br> Ofcourse, most blocks in a database file are "data blocks" (type 06), for tables and indexes. But many "administrative" blocks<br> exists as well, like the "segment header", which is a very special block for each table or index.<br> <br> Another interresting field is the RDBA, or the "relative database block address". This is an 4 byte address which most tools,<br> or dump commands, will show in hexadecimal format. Fortunately, some "tools" or dump commands will also use a more understandable<br> notation, which is like so: (file_id, blocknumber). So, for example (7,137) would then mean block number 137 in file number 7.<br> <br> Next to the header, a "transactional layer" is present. Since multiple processes can access a block, it is for example<br> needed to store transaction id's. Next, the "Table Directory" contains information about the table object,<br> which has rows stored in this particular block. Next, the "Row Directory" contains information about actual rows in the block,<br> including addresses for each row piece. Following that, is a certain amount of "free space" which can be used for new rows.<br> Then at last, we have space for the actual data, that is, the rows of the table or index object.<br> <br> Since we have "xvi32" as our Hex editor, let's take a look at how Block 20 in "c:\oradata\sales\sales_data_01.dbf"<br> looks like. Later we will see more tools and options to view contents of blocks, but let's stick to xvi32 for now.<br> <br> Make sure the instance is closed. Next, start xvi32 and open "c:\oradata\sales\sales_data_01.dbf". <br> Now, navigate to hex address 28000. Here we will see the "start" of block 20.<br> <br> <img src="oraclecorr7.jpg" align="center"/> <br> <br> Do you notice that the block starts with "06", meaning that this is just a data block. Next, the format seems to be "A2".<br> then we see the "00 00" of the unused "spare" fields. After that, the block's DBA follows etc... etc...<br> So, in my database, all data blocks start with "06 A2 00 00"<br> <br> This way, you could walk through the header, or even the entire block. This not what we will do right now, since we need<br> to see first, what other tools exist to view the contents of database blocks.<br> <br> <br> <h3>3.7 Using tools and commands to dump and view blocks.</h2> <B>3.7.1 A few words on the "ALTER SYSTEM DUMP DATAFILE filenumber BLOCK blocknumber" statement:</B><br> <br> From a sqlplus session, you can use the following "ALTER SYSTEM.." statement to dump the contents of a block to a tracefile.<br> This tracefile will be created in the USER_DUMP_DIRECTORY.<br> <br> <font face="courier" size=2 color="black"> ALTER SYSTEM DUMP DATAFILE filenumber BLOCK blocknumber;<br> <br> or:<br> <br> ALTER SYSTEM DUMP DATAFILE filenumber BLOCK min first block number BLOCK max last block number;<br> <br> <font face="arial" size=2 color="black"> So, using a connection to our SALES database, we could for example try this command:<br> <br> <font face="courier" size=2 color="black"> SQL> ALTER SYSTEM DUMP DATAFILE 5 BLOCK 17;<br> <br> <font face="arial" size=2 color="black"> The choice of numbers shown above, is due to the fact that we know that the TESTUSER.EMPLOYEE table lives in the<br> first few dozens of blocks in the "c:\oradata\sales\sales_data_01.dbf" file, which in my case, is datafile number "5".<br> <br> <font face="courier" size=2 color="blue"> <B>Note 1: how to lookup the file id's?</B><br> <br> You probably know how to lookup the file numbers of the datafiles. Anyway, I will list one methode here:<br> In my SALES database, we can see this:<br> <br> SQL> set linesize 1000<br> SQL> set pagesize 1000<br> <br> SQL> SELECT FILE#, NAME FROM v$datafile;<br> <br> FILE# NAME<br> 1 C:\ORADATA\TEST10G\SYSTEM01.DBF<br> 2 C:\ORADATA\TEST10G\UNDOTBS01.DBF<br> 3 C:\ORADATA\TEST10G\SYSAUX01.DBF<br> 4 C:\ORADATA\TEST10G\USERS01.DBF<br> 5 C:\ORADATA\SALES\SALES_DATA_01.DBF<br> 6 C:\ORADATA\SALES\SALES_DATA_02.DBF<br> 7 C:\ORADATA\SALES\SALES_INDEX_01.DBF<br> <br> By the way, you may also use the full path of the datafile in the "dump" command, like for example:<br> <br> SQL> ALTER SYSTEM DUMP DATAFILE 'C:\ORADATA\SALES\SALES_DATA_01.DBF' BLOCK 17;<br> <br> <B>Note 2: Where are the trace files stored?</B><br> <br> "Traditionally" (in Oracle up to 10g included), the init.ora/spfile.ora parameter "USER_DUMP_DEST",<br> specifies some location on our filesystem where Oracle will "dump" tracefiles from userprocesses.<br> You can find that location using:<br> <br> select value from v$parameter where name = 'user_dump_dest';<br> <br> This will show you the "USER_DUMP_DEST" directory. For example, on my 10g installation on Windows, it's:<br> <br> C:\oracle\product\10.2\admin\sales\udump<br> <br> Sometimes there can be a lot of ".trc" files in that place. Ofcourse, the timestamp of the files will tell<br> you what is your trace, but you also use this query to find the "process id" which created your file.<br> Then, just note the trace file which name carries that number, and your dump should be easy to identify.<br> <br> select spid from v$session s, v$process p<br> where p.addr = s.paddr and s.audsid = sys_context('userenv','sessionid')<br> <br> <br> <font face="arial" size=2 color="black"> Let's return to dumping a block to a trace file. So, what can we expect to see?<br> Above, we already dumped block 17 in database file 5, to a trace file in the "USER_DUMP_DEST" directory.<br> Such files are just ascii txt files which you can view using your favorite txt editor.<br> Here is some partial output of my trace:<br> <font face="courier" size=2 color="black"> <br> 73E6600 0000A206 01400011 000F6019 06010000 [......@..`......]<br> 73E6610 0000DF10 00000001 0000BBAE 000F5FFC [............._..]<br> 73E6620 00000000 01320002 01400009 00190003 [......2...@.....]<br> 73E6630 000001F1 008001DE 0067006A 00002193 [........j.g..!..]<br> 73E6640 000F6019 00000000 00000000 00000000 [.`..............]<br> <br> .. many rows skipped..<br> <br> 73E6CD0 0303012C 052E35C2 72726168 1AC20279 [,....5..harry...]<br> 73E6CE0 0303012C 052D35C2 72726168 1AC20279 [,....5-.harry...]<br> 73E6CF0 0303012C 052C35C2 72726168 1AC20279 [,....5,.harry...]<br> 73E6D00 0303012C 052B35C2 72726168 1AC20279 [,....5+.harry...]<br> 73E6D10 0303012C 052A35C2 72726168 1AC20279 [,....5*.harry...]<br> .. etc<br> <br> <font face="arial" size=2 color="black"> You are probably no so very much impressed by what you can "see" in this output.<br> There is ofcourse much more to say on how you can best interpret those trace files.<br> <br> Now, just to make you curious:<br> <br> <B>-- Other variants of the "ALTER SYSTEM DUMP" statement that might be of interrest:</B><br> <br> By the way: The "ALTER SYSTEM DUMP.." statement has more variants.<br> For example, you can dump blocks from redologs as well. For example:<br> <br> <font face="courier" size=2 color="black"> ALTER SYSTEM DUMP LOGFILE 'C:\ORADATA\SALES\redo05.dbf' DBA MIN 5 17221 DBA MAX 5 17250; <br> ALTER SYSTEM DUMP LOGFILE 'C:\ORADATA\SALES\redo05.dbf' TIME MIN 528314633; <br> <br> (DBA: database block address like we have used before, for example "datafile 5" and "blocknumber 17221")<br> <br> <font face="arial" size=2 color="black"> <B>-- Using the SET EVENTS statement:</B><br> <br> Again, there is much to learn from the "SET EVENTS" statement. For example take a look at this:<br> <br> <font face="courier" size=2 color="black"> -- Dump control file contents:<br> alter session set events 'immediate trace name CONTROLF level 10'<br> <br> -- Dump file headers:<br> alter session set events 'immediate trace name FILE_HDRS level 10'<br> <br> -- Dump redo log headers:<br> alter session set events 'immediate trace name REDOHDR level 10'<br> <br> <font face="arial" size=2 color="black"> I hope that you practice this note on a testsystem. If so, try the above SET EVENTS commands and take a look at the trace files.<br> <br> <br> <B>3.7.2 A few words on the Oracle supplied "BBED" utility.</B><br> <br> The BBED (Block Browser and EDitor) utility, is a tool from Oracle itself.<br> However, It looks as if it <B>not present on all</B> Oracle versions (and platforms), which is somewhat "strange".<br> <br> It's a very powerfull tool, and just as it's name implies, you can <B>view</B> and <B>modify</B> database blocks.<br> <br> For the Windows platform, it's an executable, ready for use. On unix, you first need to link it, before you can use it.<br> On Windows, I found it in the Oracle 8i EE Release 3, v 8.1.7, and older versions, but, for example, I could not find it in 10g or 11g.<br> So, let's try that "old" one. But the utility is password protected, so let's find that first.<br> <br> <img src="oraclecorr9.jpg" align="center"/> <br> <br> Using xvi32, opening "bbed.exe", and searching on the string "password", the password is quicly found.<br> It's "blockedit". By the way, the password is "all over the Internet" as well.<br> So let's start it:<br> <br> <font face="courier" size=2 color="black"> C:\ora8i\bin>BBED<br> Password:<br> <br> BBED: Release 2.0.0.0.0 - Limited Production on Thu Oct 28 12:03:36 2010<br> <br> (c) Copyright 2000 Oracle Corporation. All rights reserved.<br> <br> ************* !!! For Oracle Internal Use only !!! ***************<br> <br> BBED><br> <br> <font face="arial" size=2 color="black"> The banner that is shown, is somewhat intimidating, warning us that usage is for "Oracle Internal Use only".<br> This is not to take lightly. Ofcourse, on testsystems you can do what you like,<br> but for production, <B>never</B> use it without Oracle Support guidance.<br> <br> There exists an excellent article, describing the use of BBED, including some amazing examples.<br> You can find it using this url:<br> <br> http://orafaq.com/papers/dissassembling_the_data_block.pdf<br> <br> If you have browsed through that document, you probably agree that I don't need to spend any more words on bbed.<br> <br> <br> For now, I conclude this simple note. Hope you have liked it!<br> <br> <br> <font face="arial" size=2 color="blue"> <h3>What about regular Backup and Recovery techniques ?</h3> Having a optimal backup/recovery policy in place, is the best defence against <B>any</B> kind of corruption,<br> disk errors, and all sorts of other faults (like human error etc..).<br> <br> Originally, I had planned a Chapter 4, dealing on Backup and Recovery techniques.<br> Now, I think it's better to create a <B>seperate document</B> on that subject, and thus not include it in a note about "block corruption".<br> <br> <br> <br> <font face="arial" size=2 color="black"> <br> <br> <br> </body> </html>