Tuning an NT Server for Oracle

NT General Considerations

Disk Systems
Nt supports multiple file systems as NTFS, FAT and CDFS, including raw.

- FAT
In this system the entire partition is considered as a table (File Allocation Table or FAT) of addresses in which a file/directory ocuppies some slots.
The operating system maintains a copy of the allocation table in the partition, so that when the primary FAT is corrupted, the copy can be used. If both are corrupted, then the partition becomes unusuable.
This is the most popular and it can be used when booting from DOS.
Its simplicity has its own deficiency, it can access up to 4 GB on NT (2 GB on DOS).
The FIFO often leads to fragmentation

Many sites have observed faster access with FAt file systems. This may be due to the lack of file system loggin, which NTFS uses to guarantee file system integrity. Still, the lack of security means that any user may inadvertenly harm the database, and file systems have been destroyes as a result of unintentional power outages. So it is hard to justify usinf the FAT file system for a production system, especially when the performance improvement is questionable.

- NTFS
Provides a high level of security for files and folders.
The log file service guarantees the integrity of the file system, not the integrity of the data in the file system. This means that NTFS does not protect Oracle data that has been committed. So you must still use the oracle log file, which will protect your committed data. NTFS, although it has some performance and fragmentation problems, is a good compromise. It protects important file system structures, and Oracle protects important data.

- Comparission
Reliability
    - NTFS logs for file system integrity
    - FAT can lose important structures

Performance
    - FAT my be faster for smaller file systems
    - NTFS is faster on large ( 500MB) file systems

Ease of use
    - NTFS can support long file names

Is better to use NTFS for Oracle data files
 

- CPU Usage
2 CPU's have very god scaing, but the scaling usually drops off quickly after that and is VERY platform specific. Ask your vendor about scaling with CPUs. Generally with more CPUs our performance will improve. Highly utilized CPU's are not a problem until utilization exceeds 80%.
 

- Memory Management in NT
The maximum memory addresseable in NT (owing a 32-bit address) is 4 GB. owever, NT reserves 2GB. With Service Pack 3, applications can access up to 3 GB.
Management of virtual memory is considered in equally sized pieces called pages. Each page frame is 4 KB in size.
Every process is allocated some page frames in physical memory. These pages are said to be valid pages. Any page not in physical memory is considered invalid. Page fault occurs when a thread of process requests for a page that is marked as invalid. The memory manager loads that page from disk into an available page frame. When page frames are free, memory manager copies the pages to the virtual extension (disk) and makes room for new pages.
Windows NT uses a FIFO policy for each process to manage which pages get moved out when a page fault occurs. Each process gets a guaranteed minimum number of pages in memory, called the minimum working set. If enough physical memory is free, the available number of pages for a process can grow to the process maximum working set size.
Memory allocated to a threat is called reserved memory. the portion of virtual memory that has space reserved in the backing store (disk) is called committed memory.

- Performance Monitor
The first thing any administrator should do is determine exactly what's happening on a server that he wishes to speed up. Knowing what's happening on a server is crucial to being able to determine what specific areas need tuning. This is where the program called Performance Monitor comes in. This is a free program that comes standard with Windows NT Workstation and Server
Regardless of what other counters you might choose to monitor, there are four that are considered by many to be essential. We'll take a brief look at each of them.

The first is Memory counter. This counter measures such things as available RAM, the amount of pages per second, cache faults and much more. This is often considered to be the most crucial counter because memory can have such a negative effect on other counters.
For example, if your server is paging out to the disk a lot, this can cause the disk performance on a server to slow considerably. This might cause someone to replace or update the disks on a server, which will have little or no impact on performance. Increasing the memory, however, will more than likely allow disk access speeds to return to normal.

Another counter to monitor is the Processor counter. This counter measures how much time the processor is in use and the number of interrupts it's experiencing per second. As with other counters, it's very important to look at the Memory counter results along with the processor counter resluts so you're getting the entire picture.

The next set of counters that you'll want to monitor is the Disk counters. These counters not only measure how often a disk is servicing requests and how quickly, but they can also tell you things such as how much disk space is currently available.
Note: By default, Disk counters aren't enabled on Windows NT machines. This is something that an administrator has to do manually. To enable these counters, you need to type diskperf -y either in the Run dialog box or at a command prompt. You'll then be prompted to restart, and once you do, you'll be able to monitor disk performance and usage.

The final counter that's important to monitor is the Network Interface. This counter allows you to view items such as bytes per second and measure packets.

When you've chosen which counters you want to monitor, you'll need to decide over what period of time to monitor them. You'll also want to determine the intervals that Performance Monitor logs the data. Once you've logged the data that you want, place the log file in a safe place. You'll then be able to measure the effects of any changes that you make. You'll also be able to identify bottlenecks that might creep up as your user base grows. We'll now look at specific things that you can do to enhance performance on an NT server.
 

Setting up Your NT Server
* Change the performance boost
By default, the application currently in the foreground on Windows NT is given a higher priority than background processes. On a machine running an Oracle8i database, it is not desirable for a foreground application to have precedence over the database process. To change the default behavior such that the foreground application does not receive a performance boost, go to “My Computer/Properties/Performance” then, select “None” as the setting for “Performance Boost for the Foreground Application.”

* Set your NT File Cache small
You always want to set your NT File Cache to the smallest amount possible since Oracle does not use it for DB operations.
The memory manager for Windows NT defines three different pools of memory that are allocated from available RAM. These pools include one pool for the operating system kernel and other system services, one pool for the file cache, and one pool for paged memory available to user applications. By default, a Windows NT Server is set up to be a file and print server. As a result, the file cache component of system memory is large in anticipation of file server activity. However, when running an Oracle8i database, the file cache is not really needed at all since all Oracle file I/O operations bypass the file cache altogether and force data to be written directly to disk. This is required in order to ensure data integrity. In addition, the memory Oracle allocates for the SGA acts as Oracle’s own private file cache, again making an operating system cache unnecessary.
In the control panel select “Network/ Services / Server/Properties” and choose "Maximize Throughput for Network Applications"
The Maximize Throughput For File Sharing option is used primarily when you're using a server for file and print services. This option allocates all available memory to file systems caching, which will increase disk performance.
The Maximize Throughput For Network Applications option should be used on servers that are running memory-intensive server applications, as well as domain controllers. There have been reports of domain controllers responding three to four times faster simply by selecting this option.

* Remove unused Network Protocols
Since Windows NT supports many network protocols, it is common to have several enabled in a typical installation. However, very often not all of these protocols are required or used when running an Oracle8i database. If there are unnecessary protocols installed on a system, it is recommended that they be removed so that the operating system does not devote processing time to these protocols.
To remove unnecessary network protocols, In the Control Panel select “Network Neighborhood/ Protocols”. Select any unnecessary protocols not needed for the system’s configuration and click Remove.

* Reset the Network Protocol Bind Order
If there is a need to have several protocols installed on the server machine, the ones used most by Oracle should be given priority over those that are not. To reset the network protocol bind order, go to Start ® Settings ® Control Panel ® Network. Click the Bindings tab. Show bindings for all services, then double-click  server to see the list of current protocols. Verify that the protocol used by Oracle is at the top of the list, and if it is not, make it so by selecting it and moving it up until it is at the top. If there are multiple network cards installed in the machine, then verify that the card used most frequently by Oracle is at the top of the list for each protocol as well. To do this, double click each protocol that Oracle uses, and move the appropriate network card to the top of the list for that protocol.

* Remove Services you don’t need
If you navigate to the Control Panel folder and open the Services dialog box, you'll see a list of services installed on a machine. They are disabled, set to manual, or set to automatic.
The issue here is deciding what really needs to be turned on. Is your server hooked up to a printer? If not, then why have the spooler service enabled? What about other services such as Remote Access Service (RAS), License Logging Service, Network DDE, or services loaded by the manufacturer of the server? What about Plug n Play? Do these need to be running on your server?

* Use a dedicated server for Oracle.
In general, the Windows NT computer that is running your Oracle8i database should not serve as any of the following:

These services consume network, memory, and CPU resources. In addition, the Windows NT computer that is running your Oracle8i database should not be locally accessed with a high frequency or intensively used for local user processing, unless there exist significant resources to accommodate all this activity.

* Pafe File Sizing
Although excessive paging is always discouraged when trying to achieve good performance from an application, it is recommended that the total combined size of all page files in the system be at least equal to the amount of RAM in the computer. Many installations go beyond this amount and have combined page file sizes that are two or more times as large as the amount of RAM in the machine. Sizing page files in this way provides enough cushion to avoid a situation in which NT runs out of page file space and is unable to successfully perform the tasks required of it. In addition, since NT balances page file activity across page files, it is important to place different page files on different physical disks in order to balance the I/O load on those disks.

*Eliminate access updates
Each time you access a directory on a Windows NT NTFS volume, by using NT Explorer or the DIR command, for example, NT updates the LastAccess time stamp on each directory it detects. If your system contains a very large number of directories or supports a large number of users and you don't care about the LastAccess stamp, the small performance hit required for each directory update adds up to wasted performance without producing any real value. To prevent the LastAccess time stamp from being updated, start Regedit and drill down to the key  HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem
From the Edit menu, select New, then DWORD Value. Name the new value NtfsDisableLastAccessUpdate. Set its value to 1 to disable the LastAccess update.

 [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem] "NtfsDisable8dot3NameCreation"=dword:00000001

Tuning NT Memory

Tuning Memory
* Allow 40 Megabytes + for Oracle SGA
* NT requires 12 Megabytes + of Memory
* Allow 1.5 Meg of Memory per Client user
* Allow 3 Meg per Server User

Memory Monitoring
* Ensure no process is using excessive Memory
* Windows NT tries to have 4MB free at all times
* If Free Memory < 4 Meg there is a definite memory problem

Virtual Memory
Another easy memory configuration change that you can make has to do with the NT paging file. The paging file settings can be changed by going to the System control panel, selecting the Performance tab, and clicking the Change button under the Virtual Memory section.
Most experts agree that spreading the paging file on to more than one disk is one very quick way to increase memory performance on an NT server. This can increase performance because more than one disk is available to handle the requests. Another recommendation is to set the initial and maximum sizes of the paging file(s) to the same amount. This way NT won't attempt to grow the paging file and cause it to fragment.
* As a rule of thumb, set Virtual Memory to twice Physical Memory
* Have separate page files on dedicated disks (NT handles load balancing)
* Don’t have them on RAID5 or Network Drives
* NT Pages only, does not swap
 

Critical Oracle Memory Parameters
* Set Buffer Cache = 10 Meg + (DB_BLOCK_SIZE*DB_BLOCK_BUFFERS)
* Set SHARED_POOL_SIZE = 9 Meg +
* Set LOG_BUFFER = 132K + (Compile Package 15 sec 132K 24 sec 8k)
* Set SORT_AREA_RETAINED_SIZE = 32K
* Set SORT_AREA_SIZE = 500K

Serious Memory Tuning
* To give administrators some tuning flexibility, Oracle provides a utility called ORASTACK that enables an administrator to lower the stack size for Oracle threads from 1MB down to a smaller number. This allows for either higher connection counts or a larger SGA in those cases where Oracle is bumping up against the 3GB address space limit. If an application does very little highly recursive SQL or not much in the way of nested triggers or stored procedures, then turning the stack size down to 300K (for instance) is a safe procedure that will save on address space usage. If the stack space is decreased too much, typical behavior will be “ORA-03113: end-of-file on communication channel” errors returned to the client as its foreground thread terminates with a stack overflow error.
Run Orastack on listener & Oracle kernel
    'orastack oracle.exe 500000'
    'orastack tnslsnr.exe 500000'
    'orastack svrmgrl.exe 500000'
By reducing the stack of every session created in the Oracle executable, it is possible to achieve a larger user population. In a system with a 1000 users reducing the stack from 1Mb to 500K would release 500Mb of the address space for other allocations or more users.
In general, ORASTACK is only called for in high-end installations where there are several hundred or more connections to the database or when the SGA is very large (over 2GB in size). To determine how much of the 3GB address space is in use by Oracle, run Windows NT Performance Monitor by going to Start ® Programs ® Administrative Tools ® Performance Monitor. Once in Performance Monitor, choose Edit ® Add to Chart... and select the Virtual Bytes counter of the Process object for the Oracle instance. If this value displayed is close to 3GB, then the amount of available address space for Oracle is running low.

* Set Registry ORA_WORKINGSETMIN to 8 (MB) in HKEP_LOCAL_MACHINE - SOFTWARE - ORACLE
Customers who are using these registry entries should consider using them in combination with the init.ora parameter PRE_PAGE_SGA, which causes Oracle to touch all the SGA pages (including the buffer cache), bringing them into the working set of the Oracle executable. This increases instance startup time but allows the instance to reach its maximum performance capability quickly, rather than through an incremental build up as pages are loaded.

ORA_WORKINGSETMIN is the most useful parameter and prevents the working set of the Oracle process from dropping below the threshold until the instance
is shutdown :
    a) If used in combination with PRE_PAGE_SGA, the working set will start above the minimum threshold and not drop below.
    b) If ORA_WORKINGSETMIN is used in isolation, then once the working set rises above the threshold it will not drop below.

Always use these parameters with caution, because they change NT's default behaviour. Before using these parameters, ensure that the page file is large enough and  remember that pages above the minimum threshold can still be paged out.
 

Tuning NT Disk I/O

Commands to Monitor Disks
* Diskperf requires rebooting machine, is a file that let you monitor disk I/O
    diskperf - y  # for unstriped disks
    diskperf - e # for striped disks
    diskperf - y \\ofst # turns on the monitoring for a remote computer called ofst
    diskperf -n to unload it
* You can run NT’s performance monitor with Oracle Performance Monitor

Disk Striping
* RAID 0 (disk striping without parity) runs on NT 4.0 Server and Workstation
* RAID 1 (disk mirroring) Server only
* RAID 5 (striping with parity) Server only
* Use Hardware based RAID5
* Set “write back caching” on RAID controller, not default “write through”
* Do not use Raid-5 in write intensive environments
* Balance I/O across controllers
* Oracle recommends the following: Online or archived redo log files can be put on RAID 1 devices. You should not use RAID 5. ’TEMP’ tablespace data files should also go on RAID1 instead of RAID5 as well. The reason for this is that streamed write performance of distributed parity (RAID5) isn’t as good as that of simple mirroring (RAID1). Swap space can be used on RAID devices without affecting Oracle. Place REDO logs on separate fast devices, separate DATA and INDEX datafiles and avoid RAID 5 for Oracle files.

When selecting a disk subsystem, these general rules should be followed:
· SCSI is preferable to IDE.
· Fast wide SCSI is preferable to narrow/standard SCSI.
· RAID level 5 is preferable to a single disk.
· A hardware-based RAID controller is preferable to software RAID.
· More spindles are preferred over fewer spindles, to distribute writes.
· A caching disk controller is preferable to a non-caching controller, but a caching controller must have a battery backup to ensure data integrity.
 

Disk Striping Advice
* Striping runs well on NT 4.0 Server
* OLTP ideal is both RAID 0 and RAID 1
* Do not use RAID 0 without RAID 1
* RAID 5 is ideal for Data Warehouses, Reporting databases
* Mixed OLTP and DSS may consider RAID 0,1,5 combination

Other Disk Considerations
* Consider raw devices - Avoids NT File Cache
 

Table with 
230858 rows

Using Raw

Using Cooked

Select count

5.8 secs

12.4 secs

Group by

14.8 secs

21 secs

Index

18.0 secs

28.9 secs

Disk Fragmentation
NT 4.0 does NOT come with a disk defragmentation tool. You can get it at:
http://www.microsoft.com/infosource
http://www.sun-belt.com (to download diskeeper
Before running those tools, shut down the database

File Systems (FAT vs NTFS)
* Windows NT still supports the FAT (circa 1980) file systems but a superior choice for Oracle database, log and control files is the NTFS. This is because NTFS provides a journaled file system for roll-forward and roll-backward recovery, RAID 0, 1 and 5 supported in the OS, and enhanced auditing support and security functionality. NTFS will also outperform FAT on larger disk volumes. For the BOOT partition, it is recommended to use FAT for floppy boot capability.
Use NTFS files system for ORACLE files, FAT for the boot partition only.
* Must use FAT if dual boot machine with DOS or Windows 95
* NTFS has enhanced security and robustness
* Use NTFS for file compression
* FAT better if disks under 1 Gigabyte
* NTFS disks better if disks over 1 Gigabyte
* Never use compressed NTFS files!
 

Tuning NT CPU

Oracle CPU Usage
* Oracle makes excellent use of NT multiprocessors
* Set DB_BLOCK_LRU_LATCHES and LOG_SIMULTANEOUS_COPIES  = # CPUs  (not sure)
Since Windows NT and Windows 2000 use asynchronous I/O by default, using multiple DBWR processes may not necessarily improve performance.
Increasing this parameter could, in fact, reduce performance on systems where the CPU's are already over burdened.  In cases where the main performance bottleneck is that a single DBWR process cannot keep up with the work load, then increasing the value for DB_WRITER_PROCESSES may improve performance.
The main reason to manually set this parameter is to allow you to configure multiple buffer pools in Oracle8.0 and Oracle8i or if you want to use multiple DBWR processes. There must be at least one latch per DB writer gatherer process. However, this option is mutually exclusive with Async I/O in both Oracle7 and Oracle8.
* Check 'cpu_count' to make sure it is reflecting the correct number of cpu(s). You should not have to set it, it is automatically set by Oracle, but it would be interesting to know if it is reporting the correct value.
* Multiple CPUs best used with striped data files
* The amount of data read in a multiblock read is specified by the Oracle initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT. The value for this parameter should always be set high because there is rarely any disadvantage in doing so. The size of the individual I/O requests depends on both DB_FILE_MULTIBLOCK_READ_COUNT and DB_BLOCK_SIZE. A good value for multiblock reads is 64K.
* Multiblock writes are similar in nature to multiblock reads and have many of the same requirements. Multiblock writes are available through the direct path loader as well as through sorts and index creations. As with multiblock reads, the multiblock writes reduce I/O and CPU overhead by writing multiple database blocks in one larger I/O operation. The amount of data written in a multiblock write is specified by the Oracle initialization parameter DB_FILE_MULTIBLOCK_WRITE_COUNT. The size of the individual I/O requests depends on both DB_FILE_MULTIBLOCK_WRITE_COUNT and DB_BLOCK_SIZE. As with multiblock reads, a good value is 64K.
* Consider parallel query
The Oracle Parallel Query option makes it possible for some Oracle functions to be processed by multiple server processes. The functions affected are queries, index creation, data loading, and recovery. For each of these functions, the general principle is the same: keep the processing going while Oracle is waiting for I/O. For most queries, the time spent waiting for the data to be retrieved from disk usually overshadows the amount of time actually spent processing the results. With the Parallel Query option, you can compensate for this "wasted time" by using several server processes to execute the query. While one process is waiting on I/Os to complete, other processes can execute. If you are running on a Symmetric Multiprocessor (SMP) computer, a cluster, or a Massively Parallel Processing (MPP) machine, you can take maximum advantage of the Parallel Query option. The amount of parallelism can be tuned with several of the Oracle initialization parameters:
- PARALLEL_DEFAULT_MAX_SCANS = Specifies the maximum number of query servers to used by default for a query. This valued is used ONLY if no value is specified in a PARALLEL hint or in the PARALLEL definition clause. This parameter limits the number of query servers used by default when the value of the PARALLEL_DEFAULT_SCANSIZE is used by the query coordinator.
- PARALLEL_DEFAULT_SCANSIZE = Specifies the number of query servers to be used for a particular table. The size of the table divided by PARALLEL_DEFAULT_SCANSIZE determines the number of query servers, up to PARALLEL_DEFAULT_MAX_SCANS.
- PARALLEL_MAX_SERVERS = Maximum number of query servers or parallel recovery processes available for this instance.
- RECOVERY_PARALLELISM =  The number of processes to be used for instance or media recovery. A large value can greatly reduce instance recovery time. A value of zero or 1 indicates that parallel recovery will not be done and that recovery will be serial. A good value for this parameter is in the range of the number of disks you have (up to 50). I am a real fan of the Parallel Query option. I have seen great improvements from the use of parallel queries as well as dramatic reductions in recovery time when the parallel recovery feature is used. With Oracle8i.new parameters allow for more automatic configuration of parallel query.
 
 

Tuning Network
Network Tuning
* For best performance make heavy use Oracle stored code
* Simple ping or tnsping command is often best to identify bottlenecks
* NT Performance Monitor has some network info after connection made
* Best to use “Network Monitor”

Using Network Monitor
* Select Network icon in Control Panel
* Click Services tab and click “Add”
* In the Network Services box select Network Monitor Tools and Agent
* Now you can check the Performance Monitor. View the “% Network Utilization” figure
* “#Frames Dropped” is frames not processed because network buffers full
 

Registry Settings
Registry Parameters
* HKEY_LOCAL_MACHINE\System\
   CurrentControlSet\Control\Session Manager\Memory Management\DisablePagingExecutive
Change from default 0 to 1 will force NT to keep all of its kernel function resident in memory, thus helping them to execute faster.

* HKEY_LOCAL_MACHINE\System
  \CurrentControlSet\Control\FileSystem \NtfsDisableLastAccessUpdate
Change REG_DWORD from 0 to 1 will prevent updates of last access date time stamp on directories

* HKEY_LOCAL_MACHINE\System\
   CurrentControlSet\Control\ Session Manager \ Memory Management \LargeSystemCache
Change to 3 for “Network Maximizing”
   n.b. 0 = Minimize Memory Usage, 1 = Balance all memory usage, 2 = Maximize File Sharing

Recommended Reading
Tuning and Sizing NT Server - Curt Aubley
Oracle8 & Windows NT Black Book - Mike Curtis and Jacqueline King
Teach Yourself Oracle8i on Windows NT in 24 hours - Megh Thakkar (Sams)

Oracle Specific counters
Oracle has provided a set of Perfmon objects for monitoring what is going on in Oracle. Following is a brief explanation of these counters.
 

Object

Counter

Description




Oracle8 Buffer Cache

 %physreads/gets

The objective is to have the cache miss ratio < 10%, if it is too high increase DB_BLOCK_BUFFERS in the INIT.ORA file.

Oracle8 Dictionary Cache

%getmisses/gets

The objective is to have miss ration < 10%

Oracle8 Data Files

phyrds/sec, phywrts/sec

Can add a chart entry for each datafile, showing the rate at which read/writes are being done.

Oracle8 DBWR stats1

Buffer scanned/sec; LRU scans/sec

Indication of Database writer activity.

Oracle8 DBWR stats 2

Checkpoints/sec, timeouts/sec

Indication of how much time Oracle is spending processing checkpoints - may need to increase LOG file size to reduce.

Oracle8 Dynamic Space Mgmt

Recursive calls/sec

Sustained high rate indicates dynamic space management, you should review space allocation parameters.

Oracle8 Free List

%freelist waits / requests

Contention on freelist means concurrent update activity is slowing - add freelist to updated objects.

Oracle8 Library Cache

%reloads/pins

The ratio should be <10% or you should increase the INIT.ORA parameter SHARED_POOL_SIZE

Oracle8 redo Log Buffer

Redo log space requests

Space request should be near zero, otherwise increate LOG_BUFFER and investigate LOG I/O.

Oracle8 Sorts

Memory/disk sorts /sec

Sorts to DISK should be minimized. Increase INIT.ORA SORT_AREA_SIZE parameter to get more sorts into memory.

Disk Reads/Sec


The number of reads per second to that disk or volume.

Disk Writes/Sec


The number of writes per second to that disk or volume.

Disk Transfers/Sec


The total number of reads and writes per second to that disk or volume.

Avg. Disk sec/Read


The average time it takes for the read operation to occur.

Avg. Disk sec/Write


The average time it takes for the read operation to occur.

Avg. Disk sec/Transfer


The average time it takes for the read operation to occur.

Avg. Disk Queue Length


The average number of I/O’s in the disk I/O subsystem.


Pages/sec counter

Indicate that the system is paging. If you see significant activity with this counter, you are using too much memory.