vrijdag 4 januari 2013

Indexes

Blocks 

A block is the smallest unit of disk that Oracle will read or write. All data in Oracle - tables, indexes, clusters - are stored in blocks. The block size is configurable for any given database but is usually one of 4Kb, 8Kb, 16Kb, or 32Kb.
Rows in a table are usually much smaller than this, so many rows will generally fit into a single block. So you never read "just one row"; you will always read the entire block and ignore the rows you don't need. Minimising this wastage is one of the fundamentals of Oracle Performance Tuning. 

Index

Oracle uses two different index architectures: b-Tree indexes and bitmap indexes. Cluster indexes, bitmap join indexes, function-based indexes, reverse key indexes and text indexes are all just variations on the two main types. b-Tree is the "normal" index


B-tree

B-tree index is een datastructure in the form of a tree. But a tree of database blocks and not a tree of rows


Imagine the leaf blocks of the index as the pages of a phone book.
  • Each page in the book (leaf block in the index) contains many entries, which consist of a name (indexed column value) and an address (ROWID) that tells you the physical location of the telephone (row in the table).
  • The names on each page are sorted, and the pages - when sorted correctly - contain a complete sorted list of every name and address
Branch-blocks of an index:  a reduced list that contains the first row of each block plus the address of that block. In a large phone book, this reduced list containing one entry per page will still cover many pages, so the process is repeated, creating the next level up in the index, and so on until we are left with a single page: the root of the tree

example
To find the name Gallier in this b-Tree phone book, we:
  • Read page 1. This tells us that page 7 starts with Fenicier and that page 8 starts with Hottentot
  • Read page 7. This tells us that page 600 starts with Fries and that page 601 starts with Gallier.
Read page 600, which is a leaf block; we find Gallier address and phone number. So 3 blocks to find a specific row in a million row table. In reality, index blocks often fit 100 or more rows,
balanced
B-tree is a balanced tree. Every branch will be the same length. If a leaf block is full then it wil be splitted in two blocks of the same length.


How are Indexes used?

Indexes have three main uses:
  • To quickly find specific rows by avoiding a Full Table Scan
We've already seen above how a Unique Scan works.  Range Scans can occur when we use >, <, LIKE, or BETWEEN in a WHERE clause. A range scan will find the first row in the range using the same technique as the Unique Scan, but will then keep reading the index up to the end of the range. It is OK if the range covers many blocks.
  • To avoid a table access altogether
If the information is in the index, then it doesn't bother to read the table. It is a reasonably common technique to add columns to an index, not because they will be used as part of the index scan, but because they save a table access. In fact, Oracle may even perform a Fast Full Scan of an index that it cannot use in a Range or Unique scan just to avoid a table access.
  • To avoid a sort
If a sort operation requires rows in the same order as the index, then Oracle may read the table rows via the index. A sort operation is not necessary since the rows are returned in sorted order.


FTS vs Index scan

So obviously Full Table Scan is the faster way to pick up every leaf. But just as obvious is that the index  is the faster way to pick up just the smallest leaf, or even the 100 smallest leaves. As the number rises, we approach a break-even point; a number beyond which it is faster to just full table scan. This number varies depending on the table, the index, the database settings, the hardware, and the load on the server; generally it is somewhere between 1% and 10% of the table.

Know your data! If your query needs 50% of the rows in the table to resolve your query, an index scan just won't help. Not only should you not bother creating or investigating the existence of an index, you should check to make sure Oracle is not already using an index.The exception to this rule - there's always one - is when all of the columns referenced in the SQL are contained in the index. If Oracle does not have to access the table then there is no break-even point; it is generally quicker to scan the index even for 100% of the rows.


How good is the index?

 low cardinality fields make bad indexes. Why is this the case? The answer here is really about selectivity.

                unique index values
selectivity = -----------------------
                total number records

A primary key is highly selective. If there are 1000 rows, there will be 1000 unique keys in the index. Every unique key will return at most 1 row. The index will be 100% selective (1000/1000).. the best you can get.
Now let's say we have an index on a low cardinality thing like gender. If we had 1000 records, the selectivity is in the database is 2/1000 =.2%. Said in another way, 500 records come back per unique key (1000 records / 2 uniques).

 

Geen opmerkingen:

Een reactie posten