/****************************************************************************/
/* Document     : UNIX command examples, mainly based on Solaris, AIX, HP   */
/*                and of course, also Linux.                                */
/* Doc. Version : 115                                                       */
/* File         : unix.txt                                                  */
/* Purpose      : some examples for the Oracle, DB2, SQLServer DBA          */
/* Date         : 07-07-2009                                                */
/* Compiled by  : Albert van der Sel                                        */
/* Best use     : Use find/search in your editor to find a string, command, */
/*                or any identifier                                         */
/****************************************************************************/

############################################
SECTION 1. COMMANDS TO RETRIEVE SYSTEM INFO:
############################################

==========================
1. HOW TO GET SYSTEM INFO:
==========================

1.1 Short version:
==================

See section 1.2 for more detailed commands and options.

Memory:
-------

AIX:
bootinfo -r
lsattr -E -l mem0
lsattr -E -l sys0 -a realmem
svmon -G
vmstat -v
vmo -L
lparstat -i
or use a tool such as "topas" or "nmon" (these are utilities)

Linux:
cat /proc/meminfo
dmesg | grep "Physical"
free (the free command)

HP:
getmem
print_manifest | grep -i memory
dmesg | grep -i phys
echo "selclass qualifier memory;info;wait;infolog" | cstm
wc -c /dev/mem
or use a tool such as "glance", like entering "glance -m" from the prompt (a utility)

Solaris:
prtconf | grep "Memory size"    # total memory
prtmem
memps -m

Tru64:
vmstat -P | grep "Total Physical Memory"
uerf | grep memory

Swap:
-----

AIX:
lsps -a
lsps -s
pstat -s

HP:
swapinfo -a

Solaris:
swap -l
prtswap -l

Linux:
swapon -s
cat /proc/swaps
cat /proc/meminfo

cpu:
----

HP:
ioscan -kfnC processor
getconf CPU_VERSION
getconf CPU_CHIP_TYPE
model

AIX:
lparstat (-i)
prtconf | grep proc
pmcycles -m
lsattr -El procx   (x is 0, 2, etc..)
lscfg | grep proc
pstat -S
mpstat

Linux:
cat /proc/cpuinfo

Solaris:
psrinfo -v
prtconf
psrset -p
prtdiag

OS version:
-----------

HP:
uname -a

Linux:
cat /proc/version

Solaris:
uname -a
cat /etc/release    (or another way to view that file, like "more /etc/release")

Tru64:
/usr/sbin/sizer -v

AIX:
oslevel -r    (only high-level version)
oslevel -s    (shows Version, SP, TL level)
oslevel -qs   (shows complete history)
lslpp -h bos.rte

AIX Example:

# oslevel -s
5300-08-03-0831

# oslevel -qs
Known Service Packs
-------------------
5300-08-03-0831
5300-08-02-0822
5300-08-01-0819
5300-08-00-0000
5300-07-05-0831
5300-07-04-0818
5300-07-03-0811
5300-07-02-0806
5300-07-01-0748
5300-06-08-0831
5300-06-07-0818
5300-06-06-0811
5300-06-05-0806
5300-06-04-0748
5300-06-03-0732
5300-06-02-0727
5300-06-01-0722
5300-05-CSP-0000
5300-05-06-0000
5300-05-05-0000
5300-05-04-0000
5300-05-03-0000
5300-05-02-0000
5300-05-01-0000
5300-04-CSP-0000
5300-04-03-0000
5300-04-02-0000
5300-04-01-0000
5300-03-CSP-0000

AIX firmware:

lsmcode -c             display the system firmware level and service processor
lsmcode -r -d scraid0  display the adapter microcode levels for a RAID adapter scraid0
lsmcode -A             display the microcode level for all supported devices

prtconf                shows many settings, including memory, firmware, serial# etc..

Notes about Power 4 or 5 lpars:
-------------------------------

For AIX:
The uname -L command identifies a partition on a system with multiple LPARs.
The LPAR id can be useful for writing shell scripts that customize system settings
such as IP address or hostname. The output of the command looks like:

# uname -L
1 lpar01

The output of uname -L varies by maintenance level. For consistent output across
maintenance levels, add the -s flag. For example, the output of "uname -Ls" can be
parsed to assign the partition number to the variable "lpar_number" and the
partition name to "lpar_name".

For HP-UX:
Use commands like "parstatus" or "getconf PARTITION_IDENT" to get nPar information.
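The "uname -Ls" parsing just described can be sketched as follows. This is a minimal sketch: the sample string stands in for real "uname -Ls" output (which is only available on AIX LPARs), so the variable names "lpar_number" and "lpar_name" are illustrative.

```shell
# Sketch: split "uname -Ls" output ("<number> <name>") into two variables.
# The sample string below stands in for: lpar_info=$(uname -Ls)
lpar_info="1 lpar01"
lpar_number=${lpar_info%% *}   # strip everything after the first space -> "1"
lpar_name=${lpar_info#* }      # strip everything up to the first space  -> "lpar01"
echo "$lpar_number $lpar_name"
```

The parameter expansions used here are plain POSIX shell, so the same snippet works in ksh, sh, or bash.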
patches:
--------

AIX:
Is a certain fix (APAR) installed?
instfix -ik APAR_number
instfix -a -ivk APAR_number

To determine your platform firmware level, at the command prompt, type:
lscfg -vp | grep -p Platform
The last six digits of the ROM level represent the platform firmware date
in the format YYMMDD.

HP:
/usr/sbin/swlist -l patch
swlist | grep patch

Linux:
rpm -qa

Solaris:
showrev -p
pkginfo -i package_name

Tru64:
/usr/sbin/dupatch -track -type kit

Netcards:
---------

AIX:
lsdev -Cc adapter
lsdev -Cc adapter | grep ent
lsdev -Cc if
lsattr -E -l ent1
ifconfig -a

Solaris:
prtconf -D / prtconf -pv / prtconf | grep "card"
prtdiag | grep "card"
svcs -x
ifconfig -a (up plumb)

Quickly find out who is using most memory:
------------------------------------------

See the section marked &&& (use find/search on &&&)

Network sniffing:
-----------------

Here are a few short descriptions, and examples, of useful network
trace / dump commands.

-- Solaris: snoop command examples:

For example, if we want to observe traffic between systems alpha and beta,
we can use the following command:

# snoop alpha,beta

To enable data captures from the snoop output without losing packets while
writing to the screen, send the snoop output to a file. For example:

# snoop -o /tmp/snooper -V 128.50.1.250

To snoop a specific port:

# snoop port xxx

-- AIX: tcpdump command examples:

# tcpdump port 23
# tcpdump -i en0

A good way to use tcpdump is to save the network trace to a file with the -w flag,
and then analyze the trace by using different filtering options together with
the -r flag.
The following example shows how to run a basic tcpdump network trace, saving
the output in a file with the -w flag (on an Ethernet network interface):

# tcpdump -w /tmp/tcpdump.en0 -i en0

To limit the number of traced packets, use the -c flag and specify the number,
such as in the following example that traces the first 128 packets (on a
token-ring network interface):

# tcpdump -c 128 -w /tmp/tcpdump.tr0 -i tr0

iptrace command examples:

To start the iptrace daemon with the System Resource Controller (SRC), enter:

# startsrc -s iptrace -a "/tmp/nettrace"

To stop the iptrace daemon with SRC, enter the following:

# stopsrc -s iptrace

To record packets coming in and going out to any host on every interface,
enter the command in the following format:

# iptrace /tmp/nettrace

The recorded packets are received on and sent from the local host. All packet
flow between the local host and all other hosts on any interface is recorded.
The trace information is placed into the /tmp/nettrace file.

To record packets received on an interface from a specific remote host,
enter the command in the following format:

# iptrace -i en0 -p telnet -s airmail /tmp/telnet.trace

The packets to be recorded are received on the en0 interface, from remote host
airmail, over the telnet port. The trace information is placed into the
/tmp/telnet.trace file.

To record packets coming in and going out from a specific remote host,
enter the command in the following format:

# iptrace -i en0 -s airmail -b /tmp/telnet.trace

The packets to be recorded are received on the en0 interface, from remote host
airmail. The trace information is placed into the /tmp/telnet.trace file.

-- HPUX: nettl command:

Initialize the tracing/logging facility:

# nettl -start

Logging is enabled for all subsystems as determined by the /etc/nettlgen.conf file.
Log messages are sent to a log file whose name is determined by adding the suffix
.LOG000 to the log file name specified in the /etc/nettlgen.conf configuration file.
To stop the tracing facility: # nettl -stop Turn on inbound and outbound PDU tracing for the transport and session (OTS/9000) subsystems and send binary trace messages to file /var/adm/trace.TRC000. # nettl -traceon pduin pduout -entity transport session \ -file /var/adm/trace Session using nettl and the formatter netfmt: 1. Capture packets nettl -tn all -e ns_ls_ip -tm 99999 -size 1024 -f some-raw-capture-file 2. Reproduce problem. 3. Turn off trace: nettl -tf -e all 4. Create formatter filter file. Example: filter tcp_sport 6699 filter tcp_dport 6699 5. Filter the packets: 5.1 "Long" display netfmt -Nlnc filter-file -f some-raw.capture > formatted.out 5.2 "One-liner" display netfmt -Nln1Tc filter-file -f some-raw.capture > one-liner.out -- Restart inetd, nfs: -- ------------------- Starting and stopping NFS: -------------------------- On all unixes, a number of daemons should be running in order for NFS to be functional, like for example the rpc.* processes, biod, nfsd and others. Once nfs is running, and in order to actually "share" or "export" your filesystem on your server, so remote clients are able to mount the nfs mount, in most cases you should edit the "/etc/exports" file. -- AIX: The following subsystems are part of the nfs group: nfsd, biod, rpc.lockd, rpc.statd, and rpc.mountd. The nfs subsystem (group) is under control of the "resource controller", so starting and stopping nfs is actually easy # startsrc -g nfs # stopsrc -g nfs Or use smitty. 
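After starting the nfs group, you can verify that its subsystems are actually active with the SRC status command. A minimal sketch (the fallback echo is only there for platforms without SRC, and is an assumption of this sketch, not part of the AIX command):

```shell
# Check the status of all subsystems in the nfs group (AIX SRC).
# On systems without SRC (e.g. Linux), fall back to a short message.
lssrc -g nfs 2>/dev/null || echo "SRC not available on this platform"
```

On AIX this lists nfsd, biod, rpc.mountd and the other group members with their PIDs and an "active" or "inoperative" status.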
-- Redhat Linux: # /sbin/service nfs restart # /sbin/service nfs start # /sbin/service nfs stop -- On some other Linux distros # /etc/init.d/nfs start # /etc/init.d/nfs stop # /etc/init.d/nfs restart -- Solaris: If the nfs daemons aren't running, then you will need to run: # /etc/init.d/nfs.server start -- HP-UX: Issue the following command on the NFS server to start all the necessary NFS processes (HP): # /sbin/init.d/nfs.server start Or if your machine is only a client: # cd /sbin/init.d # ./nfs.client start Restart or refresh inetd after you have edited "inetd.conf": ------------------------------------------------------------ After you have edited "/etc/inetd.conf", for example, to enable or disable some service, you need to restart, or refresh inetd, to read the new configuration information. To let inetd to reread the configfile: -- AIX: # refresh -s inetd -- HPUX: # /usr/sbin/inetd -c -- Solaris: # /etc/init.d/inetd stop # /etc/init.d/inetd start # pkill -HUP inetd # The command will restart the inetd and reread the configuration. -- RedHat / Linux # service xinetd restart or # /etc/init.d/inetd restart 1.2 More Detail: ================ 1.2.1 Show memory in Solaris: ============================= prtconf: -------- Use this command to obtain detailed system information about your Sun Solaris installation # /usr/sbin/prtconf # prtconf -v Displays the size of the system memory and reports information about peripheral devices Use this command to see the amount of memory: # /usr/sbin/prtconf | grep "Mem" sysdef -i reports on several system resource limits. 
Other parameters can be checked on a running system using adb -k : # adb -k /dev/ksyms /dev/mem parameter-name/D ^D (to exit) Other commands: --------------- # prtmem # memps -m 1.2.2 Show memory in AIX: ========================= >> Show Total memory: --------=====-------- # bootinfo -r # lsattr -El sys0 -a realmem # prtconf (you can grep it on memory) >> Show Details of memory: -------------------------- You can have a more detailed and comprehensive look at AIX memory by using "vmstat -v" and "vmo -L" or "vmo -a": For example: # vmstat -v 524288 memory pages 493252 lruable pages 67384 free pages 7 memory pools 131820 pinned pages 80.0 maxpin percentage 20.0 minperm percentage 80.0 maxperm percentage 25.4 numperm percentage 125727 file pages 0.0 compressed percentage 0 compressed pages 25.4 numclient percentage 80.0 maxclient percentage 125575 client pages 0 remote pageouts scheduled 14557 pending disk I/Os blocked with no pbuf 6526890 paging space I/Os blocked with no psbuf 18631 filesystem I/Os blocked with no fsbuf 0 client filesystem I/Os blocked with no fsbuf 49038 external pager filesystem I/Os blocked with no fsbuf 0 Virtualized Partition Memory Page Faults 0.00 Time resolving virtualized partition memory page faults The vmo command really gives lots of output. In the following example only a small fraction of the output is shown: # vmo -L .. 
lrubucket 128K 128K 128K 64K 4KB pages D -------------------------------------------------------------------------------- maxclient% 80 80 80 1 100 % memory D maxperm% minperm% -------------------------------------------------------------------------------- maxfree 1088 1088 1088 8 200K 4KB pages D minfree memory_frames -------------------------------------------------------------------------------- maxperm 394596 394596 S -------------------------------------------------------------------------------- maxperm% 80 80 80 1 100 % memory D minperm% maxclient% -------------------------------------------------------------------------------- maxpin 424179 424179 S .. .. >> To further look at your virtual memory and its causes, you can use a combination of: --------------------------------------------------------------------------------------- # ipcs -bm (shared memory) # lsps -a (paging) # vmo -a or vmo -L (virtual memory options) # svmon -G (basic memory allocations) # svmon -U (virtual memory usage by user) # svmon -P # vmstat -v To print out the memory usage statistics for the users root and steve taking into account only working segments, type: svmon -U root steve -w To print out the top 10 users of the paging space, type: svmon -U -g -t 10 To print out the memory usage statistics for the user steve, including the list of the process identifiers, type: svmon -U steve -l svmon -U emcdm -l # vmo -o npswarn=value # schedo -o pacefork=15 Note: sysdumpdev -e Although the sysdumpdev command is used to show or alter the dumpdevice for a system dump, you can also use it to show how much real memory is used. The command # sysdumpdev -e provides an estimated dump size taking into account the current memory (not pagingspace) currently in use by the system. Note: the rmss command: The rmss (Reduced-Memory System Simulator) command is used to ascertain the effects of reducing the amount of available memory on a system without the need to physically remove memory from the system. 
It is useful for system sizing, as you can install more memory than is required
and then use rmss to reduce it. Using other performance tools, the effects of
the reduced memory can be monitored.
The rmss command has the ability to run a command multiple times using different
simulated memory sizes, and produce statistics for all of those memory sizes.
The rmss command resides in /usr/bin and is part of the bos.perf.tools fileset,
which is installable from the AIX base installation media.

Syntax
rmss -p -c -r

Options
-p      Print the current value
-c MB   Change available memory to MB (in Mbytes)
-r      Restore all memory to use

Example: find out how much memory you have online
rmss -p
Example: Change available memory to 256 Mbytes
rmss -c 256
Example: Undo the above
rmss -r

Warning: rmss can damage performance very seriously.
Don't go below 25% of the machine's memory.
Never forget to finish with rmss -r.

The pstat command:
------------------

The pstat command displays many system tables, such as the process table,
inode table, or processor status table. It interprets the contents of the
various system tables and writes them to standard output.
Use the pstat command from the AIX 5.2 command prompt. See the command
reference for details and examples, or use the syntax summary in the table below.

Flags
-a            Displays entries in the process table
-A            Displays all entries in the kernel thread table
-f            Displays the file table
-i            Displays the i-node table and the i-node data block addresses
-p            Displays the process table
-P            Displays runnable kernel thread table entries only
-s            Displays information about the swap or paging space usage
-S            Displays the status of the processors
-t            Displays the tty structures
-u ProcSlot   Displays the user structure of the process in the designated
              slot of the process table. An error message is generated if you
              attempt to display a swapped out process.
-T            Displays the system variables.
              These variables are briefly described in var.h
-U ThreadSlot Displays the user structure of the kernel thread in the designated
              slot of the kernel thread table. An error message is generated if
              you attempt to display a swapped out kernel thread.

&&&
---------------------------------------------------------------------------------
Note 1: How to get a "reasonable" view on memory consumption of a process in UNIX:
---------------------------------------------------------------------------------

Using just the command line, or some free utilities.

In general this is not so easy to answer, because of the "sub components" you
might distinguish in memory occupation. For example, do you mean RSS, real,
shared, virtual, paging, including all libraries loaded, etc..?

-- Some people like to use the ps command with some special flags, like

ps -vg
ps auxw    # or ps auxw | sort -r +3 | head -10   (top users)

But those commands seem not so very satisfactory, and not "complete"
in their output.

-- There are some great common utilities like topas, nmon, top etc.., or tools
specific to a certain Unix, like SMC for Solaris. No bad word about those tools,
because they are great. But some people think that they are not satisfactory on
the subject of memory consumption of a process (although they show a lot of
other interesting information).

-- Some other ways might be:

# procmap pid    (in e.g. AIX)
# pmap -x pid    (in e.g. Solaris)

Those tools also show a "total" memory usage, which is a good indicator.
For example:

# pmap -x $$
492328:  -ksh
 Address  Kbytes     RSS    Anon  Locked Mode   Mapped File
00010000     192     192       -       - r-x--  ksh
00040000       8       8       8       - rwx--  ksh
00042000      40      40       8       - rwx--    [ heap ]
FF180000     680     680       -       - r-x--  libc.so.1
FF23A000      24      24       -       - rwx--  libc.so.1
FF240000       8       8       8       - rwx--  libc.so.1
FF280000     576     576       -       - r-x--  libnsl.so.1
FF310000      40      40       -       - rwx--  libnsl.so.1
FF31A000      24      16       -       - rwx--  libnsl.so.1
FF350000      16      16       -       - r-x--  libmp.so.2
FF364000       8       8       -       - rwx--  libmp.so.2
FF380000      40      40       -       - r-x--  libsocket.so.1
FF39A000       8       8       -       - rwx--  libsocket.so.1
FF3A0000       8       8       -       - r-x--  libdl.so.1
FF3B0000       8       8       8       - rwx--    [ anon ]
FF3C0000     152     152       -       - r-x--  ld.so.1
FF3F6000       8       8       8       - rwx--  ld.so.1
FFBFC000      16      16       8       - rw---    [ stack ]
-------- ------- ------- ------- -------
total Kb    1856    1848      48       -

This gives you a reasonable idea of the memory consumption of a pid.

You can also try:

# svmon -G
# svmon -U
# svmon -P -t 10        (top 10 users)
# svmon -U steve -l     (memory stats for user steve)

But svmon is not available on all unixes. The following might also be
helpful (not on all unixes):

# ls -l /proc/{pid}/as
# prstat -a -s rss

And ps can give some info as well:

# ps -ef | egrep -v "STIME|$LOGNAME" | sort +3 -r | head -n 15
# ps au

1.2.3 Show memory in Linux:
===========================

# /usr/sbin/dmesg | grep "Physical:"
# cat /proc/meminfo
# free -m

The ipcs, vmstat, iostat and that type of commands are of course more or less
the same in Linux as they are in Solaris or AIX.
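As a quick illustration of pulling specific numbers out of /proc/meminfo on Linux (the field names MemTotal and MemFree are as found in the file itself, where values are reported in kB):

```shell
# Print MemTotal and MemFree from /proc/meminfo, converted to MB.
awk '/^MemTotal:|^MemFree:/ {printf "%s %d MB\n", $1, $2/1024}' /proc/meminfo
```

The same pattern extends to SwapTotal, Cached, and the other fields in the file.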
1.2.4 Show aioservers in AIX: ============================= # lsattr -El aio0 autoconfig available STATE to be configured at system restart True fastpath enable State of fast path True kprocprio 39 Server PRIORITY True maxreqs 4096 Maximum number of REQUESTS True maxservers 10 MAXIMUM number of servers per cpu True minservers 1 MINIMUM number of servers True # pstat -a | grep -c aios 20 # ps -k | grep aioserver 331962 - 0:15 aioserver 352478 - 0:14 aioserver 450644 - 0:12 aioserver 454908 - 0:10 aioserver 565292 - 0:11 aioserver 569378 - 0:10 aioserver 581660 - 0:11 aioserver 585758 - 0:17 aioserver 589856 - 0:12 aioserver 593954 - 0:15 aioserver 598052 - 0:17 aioserver 602150 - 0:12 aioserver 606248 - 0:13 aioserver 827642 - 0:14 aioserver 991288 - 0:14 aioserver 995388 - 0:11 aioserver 1007616 - 0:12 aioserver 1011766 - 0:13 aioserver 1028096 - 0:13 aioserver 1032212 - 0:13 aioserver What are aioservers in AIX5?: With IO on filesystems, for example if a database is involved, you may try to tune the number of aioservers (asynchronous IO) AIX 5L supports asynchronous I/O (AIO) for database files created both on file system partitions and on raw devices. AIO on raw devices is implemented fully into the AIX kernel, and does not require database processes to service the AIO requests. When using AIO on file systems, the kernel database processes (aioserver) control each request from the time a request is taken off the queue until it completes. The kernel database processes are also used with I/O with virtual shared disks (VSDs) and HSDs with FastPath disabled. By default, FastPath is enabled. The number of aioserver servers determines the number of AIO requests that can be executed in the system concurrently, so it is important to tune the number of aioserver processes when using file systems to store Oracle Database data files. - Use one of the following commands to set the number of servers. 
This applies only when using asynchronous I/O on file systems rather than
raw devices:

# smit aio
# chdev -P -l aio0 -a maxservers='128' -a minservers='20'

- To set asynchronous IO to `Available':

# chdev -l aio0 -P -a autoconfig=available

You need to restart the Server:

# shutdown -Fr

1.2.5 aio on Linux distro's:
============================

On some Linux distro's, Oracle 9i/10g supports asynchronous I/O, but it is
disabled by default because some Linux distributions do not have libaio by
default. For Solaris, the following configuration is not required - skip down
to the section on enabling asynchronous I/O.
On Linux, the Oracle binary needs to be relinked to enable asynchronous I/O.
The first thing to do is shut down the Oracle server. After Oracle has shut
down, do the following steps to relink the binary:

su - oracle
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk async_on
make -f ins_rdbms.mk ioracle

1.2.6 The ipcs and ipcrm commands:
==================================

The "ipcs" command is really a "listing" command. But if you need to intervene
in memory structures, for example if you need to "clear" or remove a shared
memory segment because a faulty or crashed application left semaphores, memory
identifiers, or queues in place, you can use the "ipcrm" command to remove
those structures.

Example ipcrm command usage:
----------------------------

Suppose an application crashed, but it cannot be started again. The following
might help, if you happen to know which IPC identifier it used.
Suppose the app used 47500 as the IPC key. Convert this decimal number to hex,
which is, in this example, B98C (for instance with: printf "%X\n" 47500).
Now do the following:

# ipcs -bm | grep B98C

This might give you, for example, the shared memory identifier "50855977".
Now clear the segment:

# ipcrm -m 50855977

It might also be that a semaphore and/or queue is still "left over".
In that case you might also try commands like the following examples:

ipcs -q
ipcs -s

# ipcrm -s 2228248    (remove semaphore)
# ipcrm -q 5111883    (remove queue)

Note: in some cases the "slibclean" command can be used to clear unused modules
in kernel and library memory. Just give as root the command:

# slibclean

Other Example:
--------------

If you run the following command to remove a shared memory segment and you
get this error:

# ipcrm -m 65537
ipcrm: 0515-020 shmid(65537) was not found.

However, if you run the ipcs command, you still see the segment there:

# ipcs | grep 65537
m   65537 0x00000000 DCrw-------     root   system

If you look carefully, you will notice the "D" in the fourth column.
The "D" means:

D   The associated shared memory segment has been removed. It disappears
    when the last process attached to the segment detaches it.

So, to clear the shared memory segment, find the process which is still
associated with the segment:

# ps -ef | grep process_owner

where process_owner is the name of the owner using the shared segment.

Now kill the process found from the ps command above:

# kill -9 pid

Running another ipcs command will show that the shared memory segment
no longer exists:

# ipcs | grep 65537

1.2.7 Show patches, version, systeminfo:
========================================

Solaris:
========

showrev:
--------

# showrev      Displays system summary information.
# showrev -p   Reports which patches are installed

sysdef and dmesg:
-----------------

The following commands also display configuration information:

# sysdef
# dmesg

versions:
---------

==> To check your Solaris version:

# uname -a or uname -m
# cat /etc/release
# isainfo -v

==> To check your AIX version:

# oslevel
# oslevel -r   tells you which maintenance level you have.
>> To find the known recommended maintenance levels: # oslevel -rq >> To find all filesets lower than a certain maintenance level: # oslevel -rl 5200-06 >> To find all filesets higher than a certain maintenance level: # oslevel -rg 5200-05 >> To list all known recommended maintenance and technology levels on the system, type: # oslevel -q -s Known Service Packs ------------------- 5300-05-04 5300-05-03 5300-05-02 5300-05-01 5300-05-00 5300-04-CSP 5300-04-03 5300-04-02 5300-04-01 5300-03-CSP >> Example: 5300-02 is TL 02 5300-02-04 is TL 02 and SP 04 5300-02-CSP is TL 02 and CSP for TL 02 (and there won't be anymore SPs because when you see a CSP it is because the next TL has been released. In this case it would be TL 03). >> How can I determine which fileset updates are missing from a particular AIX level? To determine which fileset updates are missing from 5300-04, for example, run the following command: # oslevel -rl 5300-04 >> What SP (Service Pack) is installed on my system? To see which SP is currently installed on the system, run the oslevel -s command. Sample output for an AIX 5L Version 5.3 system, with TL4, and SP2 installed would be: # oslevel -s 5300-04-02 >> Is a CSP (Concluding Service Pack) installed on my system? To see if a CSP is currently installed on the system, run the oslevel -s command. Sample output for an AIX 5L Version 5.3 system, with TL3, and CSP installed would be: # oslevel -s 5300-03-CSP ==> To check your HP machine: # model 9000/800/rp7410 : machine info on AIX How do I find out the Chip type, System name, Node name, Model Number etc.? The uname command provides details about your system. uname -p Displays the chip type of the system. For example, powerpc. uname -r Displays the release number of the operating system. uname -s Displays the system name. For example, AIX. uname -n Displays the name of the node. uname -a Displays the system name, nodename,Version, Machine id. uname -M Displays the system model name. 
For example, IBM, 7046-B50.
uname -v   Displays the operating system version.
uname -m   Displays the machine ID number of the hardware running the system.
uname -u   Displays the system ID number.

Architecture:
-------------

To see if you have a CHRP machine, log into the machine as the root user,
and run the following command:

# lscfg | grep Architecture

or use:

# lscfg -pl sysplanar0 | more

The bootinfo -p command also shows the architecture of the pSeries, RS/6000:

# bootinfo -p
chrp

1.2.8 Check whether you have a 32 bit or 64 bit version:
========================================================

- Solaris:

# isainfo -v

If /usr/bin/isainfo cannot be found, then the OS only supports 32-bit process
address spaces. (Solaris 7 was the first version that could run 64-bit binaries
on certain SPARC-based systems.)
So a ksh-based test might look something like:

if [ -x /usr/bin/isainfo ]; then
    bits=`/usr/bin/isainfo -b`
else
    bits=32
fi

- AIX:

Command: /bin/lslpp -l bos.64bit
...to see if bos.64bit is installed & committed.
-or-
/bin/locale64
...error message if on a 32bit machine, such as:
Could not load program /bin/locale64:
Cannot run a 64-bit program on a 32-bit machine.

Or use:
# bootinfo -K   displays the current kernel wordsize of "32" or "64"
# bootinfo -y   tells if hardware is 64-bit capable
# bootinfo -p   If it returns the string 32, it is only capable of running the
                32-bit kernel. If it returns the string chrp, the machine is
                capable of running the 64-bit kernel or the 32-bit kernel.

Or use:
# /usr/bin/getconf HARDWARE_BITMODE

This command should return the following output:
64

Note:
-----

HOW TO CHANGE KERNEL MODE OF IBM AIX 5L (5.1)
---------------------------------------------

AIX 5L has pre-configured kernels. These are listed below for Power processors:

/usr/lib/boot/unix_up   32 bit uni-processor
/usr/lib/boot/unix_mp   32 bit multi-processor kernel
/usr/lib/boot/unix_64   64 bit multi-processor kernel

Switching between kernel modes means using different kernels.
This is simply done by pointing the location that is referenced by the system to these kernels. Use symbolic links for this purpose. During boot AIX system runs the kernel in the following locations: /unix /usr/lib/boot/unix The base operating system 64-bit runtime fileset is bos.64bit. Installing bos.64bit also installs the /etc/methods/cfg64 file. The /etc/methods/cfg64 file provides the option of enabling or disabling the 64-bit environment via SMIT, which updates the /etc/inittab file with the load64bit line. (Simply adding the load64bit line does not enable the 64-bit environment). The command lslpp -l bos.64bit reveals if this fileset is installed. The bos.64bit fileset is on the AIX media; however, installing the bos.64bit fileset does not ensure that you will be able to run 64-bit software. If the bos.64bit fileset is installed on 32-bit hardware, you should be able to compile 64-bit software, but you cannot run 64-bit programs on 32-bit hardware. The syscalls64 extension must be loaded in order to run a 64-bit executable. This is done from the load64bit entry in the inittab file. You must load the syscalls64 extension even when running a 64-bit kernel on 64-bit hardware. To determine if the 64-bit kernel extension is loaded, at the command line, enter genkex |grep 64. Information similar to the following displays: 149bf58 a3ec /usr/lib/drivers/syscalls64.ext To change the kernel mode follow steps below: 1. Create symbolic link from /unix and /usr/lib/boot/unix to the location of the desired kernel. 2. Create boot image. 3. Reboot AIX. 
Below lists the detailed actions to change kernel mode: To change to 32 bit uni-processor mode: # ln -sf /usr/lib/boot/unix_up /unix # ln -sf /usr/lib/boot/unix_up /usr/lib/boot/unix # bosboot -ad /dev/ipldevice # shutdown -r To change to 32 bit multi-processor mode: # ln -sf /usr/lib/boot/unix_mp /unix # ln -sf /usr/lib/boot/unix_mp /usr/lib/boot/unix # bosboot -ad /dev/ipldevice # shutdown -r To change to 64 bit multi-processor mode: # ln -sf /usr/lib/boot/unix_64 /unix # ln -sf /usr/lib/boot/unix_64 /usr/lib/boot/unix # bosboot -ad /dev/ipldevice # shutdown -r IMPORTANT NOTE: If you are changing the kernel mode to 32-bit and you will run 9.2 on this server, the following line should be included in /etc/inittab: load64bit:2:wait:/etc/methods/cfg64 >/dev/console 2>&1 # Enable 64-bit execs This allows 64-bit applications to run on the 32-bit kernel. Note that this line is also mandatory if you are using the 64-bit kernel. In AIX 5.2, the 32-bit kernel is installed by default. The 64-bit kernel, along with JFS2 (enhanced journaled file system), can be enabled at installation time. Checking if other unixes are in 32 or 64 mode: ---------------------------------------------- - Digital UNIX/Tru64: This OS is only available in 64bit form. - HP-UX(Available in 64bit starting with HP-UX 11.0): Command: /bin/getconf KERNEL_BITS ...returns either 32 or 64 - SGI: This OS is only available in 64bit form. - The remaining supported UNIX platforms are only available in 32bit form. scinstall: ---------- # scinstall -pv Displays Sun Cluster software release and package version information 1.2.9 Info about CPUs: ====================== Solaris: -------- # psrinfo -v Shows the number of processors and their status. 
# psrinfo -v|grep "Status of processor"|wc -l Shows number of cpu's Linux: ------ # cat /proc/cpuinfo # cat /proc/cpuinfo | grep processor|wc -l Especially with Linux, the /proc directory contains special "files" that either extract information from or send information to the kernel HP-UX: ------ # ioscan -kfnC processor # /usr/sbin/ioscan -kf | grep processor # grep processor /var/adm/syslog/syslog.log # /usr/contrib/bin/machinfo (Itanium) Several ways as, 1. sam -> performance monitor -> processor 2. print_manifest (if ignite-ux installed) 3. machinfo (11.23 HP versions) 4. ioscan -fnC processor 5. echo "processor_count/D" | adb /stand/vmunix /dev/kmem 6. top command to get cpu count The "getconf" command can give you a lot of interesting info. The parameters are: ARG_MAX _BC_BASE_MAX BC_DIM_MAX BS_SCALE_MAX BC_STRING_MAX CHARCLASS_NAME_MAX CHAR_BIT CHAR_MAX CHAR_MIN CHILD_MAX CLK_TCK COLL_WEIGHTS_MAX CPU_CHIP_TYPE CS_MACHINE_IDENT CS_PARTITION_IDENT CS_PATH CS_MACHINE_SERIAL EXPR_NEST_MAX HW_CPU_SUPP_BITS HW_32_64_CAPABLE INT_MAX INT_MIN KERNEL_BITS LINE_MAX LONG_BIT LONG_MAX LONG_MIN MACHINE_IDENT MACHINE_MODEL MACHINE_SERIAL MB_LEN_MAX NGROUPS_MAX NL_ARGMAX NL_LANGMAX NL_MSGMAX NL_NMAX NL_SETMAX NL_TEXTMAX NZERO OPEN_MAX PARTITION_IDENT PATH _POSIX_ARG_MAX _POSIX_JOB_CONTROL _POSIX_NGROUPS_MAX _POSIX_OPEN_MAX _POSIX_SAVED_IDS _POSIX_SSIZE_MAX _POSIX_STREAM_MAX _POSIX_TZNAME_MAX _POSIX_VERSION POSIX_ARG_MAX POSIX_CHILD_MAX POSIX_JOB_CONTROL POSIX_LINK_MAX POSIX_MAX_CANON POSIX_MAX_INPUT POSIX_NAME_MAX POSIX_NGROUPS_MAX POSIX_OPEN_MAX POSIX_PATH_MAX POSIX_PIPE_BUF POSIX_SAVED_IDS POSIX_SSIZE_MAX POSIX_STREAM_MAX POSIX_TZNAME_MAX POSIX_VERSION POSIX2_BC_BASE_MAX POSIX2_BC_DIM_MAX POSIX2_BC_SCALE_MAX POSIX2_BC_STRING_MAX POSIX2_C_BIND POSIX2_C_DEV POSIX2_C_VERSION POSIX2_CHAR_TERM POSIX_CHILD_MAX POSIX2_COLL_WEIGHTS_MAX POSIX2_EXPR_NEST_MAX POSIX2_FORT_DEV POSIX2_FORT_RUN POSIX2_LINE_MAX POSIX2_LOCALEDEF POSIX2_RE_DUP_MAX POSIX2_SW_DEV POSIX2_UPE POSIX2_VERSION 
SC_PASS_MAX SC_XOPEN_VERSION SCHAR_MAX SCHAR_MIN SHRT_MAX SHRT_MIN SSIZE_MAX

Example:

# getconf CPU_VERSION

sample function in a shell script:

get_cpu_version()
{
  case `getconf CPU_VERSION` in
    # ???) echo "Itanium[TM] 2" ;;
    768) echo "Itanium[TM] 1" ;;
    532) echo "PA-RISC 2.0" ;;
    529) echo "PA-RISC 1.2" ;;
    528) echo "PA-RISC 1.1" ;;
    523) echo "PA-RISC 1.0" ;;
    *) return 1 ;;
  esac
  return 0
}

AIX:
----

# pmcycles -m
Cpu 0 runs at 1656 MHz
Cpu 1 runs at 1656 MHz
Cpu 2 runs at 1656 MHz
Cpu 3 runs at 1656 MHz

# lscfg | grep proc

More cpu information on AIX:

# lsattr -El procx     (where x is the number of the cpu)

type      powerPC_POWER5 Processor type  False
frequency 165600000      Processor speed False
..
..

where False means that the value cannot be changed through an AIX command.

# lparstat        (only for latest AIX versions)
# lparstat -i

To view CPU scheduler tunable parameters, use the schedo command:

# schedo -a

In AIX 5L on Power5, you can switch between Simultaneous Multithreading (SMT) and
Single Threading (ST), as follows (smtctl):

# smtctl -m off   will set SMT mode to disabled
# smtctl -m on    will set SMT mode to enabled
# smtctl -W boot  makes SMT effective on next boot
# smtctl -W now   effects SMT now, but will not persist across reboots

When you want to keep the setting across reboots, you must use the bosboot command in order
to create a new boot image.

1.2.10 Other stuff:
===================

runlevel:
---------

To show the init runlevel:
# who -r

Top users:
----------

To get a quick impression of the top 10 users on the system at this time:

ps auxw | sort -r +3 | head -10    - Shows top 10 memory usage by process
ps auxw | sort -r +2 | head -10    - Shows top 10 CPU usage by process

More accuracy in memory usage with the ps command: ps -vg

ps -vg:
-------

Using "ps vg" gives a per-process tally of memory usage for each running process.
Several fields give memory usage in different units, but these numbers do not tell
the whole story on where all the memory goes.
First of all, the man page for ps does not give an accurate description of the memory related
fields. Here is a better description:

RSS  - This tells how much RAM resident memory is currently being used for the text and data
       segments for a particular process, in units of kilobytes. (This value will always be a
       multiple of 4, since memory is allocated in 4 KB pages.)

%MEM - This is RSS expressed as a fraction of the total size of RAM for a particular process.
       Since RSS is some subset of the total resident memory usage of a process, the %MEM value
       will also be lower than actual.

TRS  - This tells how much RAM resident memory is currently being used for the text segment for
       a particular process, in units of kilobytes. This will always be less than or equal to RSS.

SIZE - This tells how much paging space is allocated for this process for the text and data
       segments, in units of kilobytes. If the executable file is on a local filesystem, the page
       space usage for text is zero. If the executable is on an NFS filesystem, the page space
       usage will be nonzero. This number may be greater than RSS, or it may not, depending on
       how much of the process is paged in. The reason RSS can be larger is that RSS counts text
       whereas SIZE does not.

TSIZ - This field is absolutely bogus because it is not a multiple of 4 and does not correlate
       to any of the other fields.

These fields only report on a process's text and data segments. Segments which cannot be
interrogated at this time are:

- Text portion of shared libraries (segment 13)
- Files that are in use. Open files are cached in memory as individual segments.
- Shared data segments created with shmat.
- Kernel segments such as kernel segment 0, kernel extension segments, and virtual memory
  management segments.

In summary, ps is not a very good tool to measure system memory usage. It can give you some idea
where some of the memory goes, but it leaves too many questions unanswered about the total usage.
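As a rough illustration of the point above, you can total the RSS column over all processes and
compare it against real memory. A minimal sketch, assuming BSD-style "ps aux" output with RSS in
kilobytes in column 6 (column positions vary per UNIX; on AIX "ps vg" also reports RSS in KB):

```shell
#!/bin/sh
# Sum the RSS column (KB) over all processes and print the total in MB.
# NR > 1 skips the ps header line.
ps aux | awk 'NR > 1 { total += $6 } END { printf "Total RSS: %d MB\n", total / 1024 }'
```

Because shared pages are counted once per process, this total can even exceed physical RAM; it is
indicative only, which is exactly why ps alone cannot account for system memory usage.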
shared memory: -------------- To check shared memory segment, semaphore array, and message queue limits, issue the ipcs -l command. # ipcs The following tools are available for monitoring the performance of your UNIX-based system. pfiles: ------- /usr/proc/bin/pfiles This shows the open files for this process, which helps you diagnose whether you are having problems caused by files not getting closed. lsof: ----- This utility lists open files for running UNIX processes, like pfiles. However, lsof gives more useful information than pfiles. You can find lsof at ftp://vic.cc.purdue.edu/pub/tools/unix/lsof/. Example of lsof usage: You can see CIO (concurrent IO) in the FILE-FLAG column if you run lsof +fg, e.g.: tarunx01:/home/abielewi:# /p570build/LSOF/lsof-4.76/usr/local/bin/lsof +fg /baanprd/oradat COMMAND PID USER FD TYPE FILE-FLAG DEVICE SIZE/OFF NODE NAME oracle 434222 oracle 16u VREG R,W,CIO,DSYN,LG;CX 39,1 6701056 866 /baanprd/oradat (/dev/bprdoradat) oracle 434222 oracle 17u VREG R,W,CIO,DSYN,LG;CX 39,1 6701056 867 /baanprd/oradat (/dev/bprdoradat) oracle 442384 oracle 15u VREG R,W,CIO,DSYN,LG;CX 39,1 1174413312 875 /baanprd/oradat (/dev/bprdoradat) oracle 442384 oracle 16u VREG R,W,CIO,DSYN,LG;CX 39,1 734011392 877 /baanprd/oradat (/dev/bprdoradat) oracle 450814 oracle 15u VREG R,W,CIO,DSYN,LG;CX 39,1 1174413312 875 /baanprd/oradat (/dev/bprdoradat) oracle 450814 oracle 16u VREG R,W,CIO,DSYN,LG;CX 39,1 1814044672 876 /baanprd/oradat (/dev/bprdoradat) oracle 487666 oracle 15u VREG R,W,CIO,DSYN,LG;CX 39,1 1174413312 875 /baanprd/oradat (/dev/bprdoradat You should also see O_CIO in your file open calls if you run truss, e.g.: open("/opt/oracle/rcat/oradat/redo01.log", O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 18 VMSTAT SOLARIS: --------------- # vmstat This command is ideal for monitoring paging rate, which can be found under the page in (pi) and page out (po) columns. 
Other important columns are the memory columns: free, unreserved swap space (swap) and free
memory (free). This command is useful for determining if something is suspended or just taking
a long time.

Example:

 kthr      memory            page            disk          faults      cpu
 r b w   swap    free   re  mf   pi po fr de sr m0 m1 m3 m4  in   sy   cs  us sy id
 0 0 0 2163152 1716720 157 141 1179  1  1  0  0  0  0  0  0 680 1737  855  10  3 87
 0 0 0 2119080 1729352   0   1    0  0  0  0  0  0  0  1  0 345  658  346   1  1 98
 0 0 0 2118960 1729232   0 167    0  0  0  0  0  0  0  0  0 402 1710  812   4  2 94
 0 0 0 2112992 1723264   0 1261   0  0  0  0  0  0  0  0  0 1026 5253 1848 10  5 85
 0 0 0 2112088 1722352   0 248    0  0  0  0  0  0  0  0  0 505 2822 1177   5  2 92
 0 0 0 2116288 1726544   4  80    0  0  0  0  0  0  0  0  0 817 4015 1530   6  4 90
 0 0 0 2117744 1727960   4   2   30  0  0  0  0  0  0  0  0 473 1421  640   2  2 97

procs/r: Run queue length.
procs/b: Processes blocked while waiting for I/O.
procs/w: Idle processes which have been swapped.
memory/swap: Free, unreserved swap space (Kb).
memory/free: Free memory (Kb). (Note that this will grow until it reaches lotsfree, at which
  point the page scanner is started. See "Paging" for more details.)
page/re: Pages reclaimed from the free list. (If a page on the free list still contains data
  needed for a new request, it can be remapped.)
page/mf: Minor faults (page in memory, but not mapped). (If the page is still in memory, a minor
  fault remaps the page. It is comparable to the vflts value reported by sar -p.)
page/pi: Paged in from swap (Kb/s). (When a page is brought back from the swap device, the
  process will stop execution and wait. This may affect performance.)
page/po: Paged out to swap (Kb/s). (The page has been written and freed. This can be the result
  of activity by the pageout scanner, a file close, or fsflush.)
page/fr: Freed or destroyed (Kb/s). (This column reports the activity of the page scanner.)
page/de: Freed after writes (Kb/s). (These pages have been freed due to a pageout.)
page/sr: Scan rate (pages).
Note that this number is not reported as a "rate," but as a total number of pages scanned. disk/s#: Disk activity for disk # (I/O's per second). faults/in: Interrupts (per second). faults/sy: System calls (per second). faults/cs: Context switches (per second). cpu/us: User CPU time (%). cpu/sy: Kernel CPU time (%). cpu/id: Idle + I/O wait CPU time (%). When analyzing vmstat output, there are several metrics to which you should pay attention. For example, keep an eye on the CPU run queue column. The run queue should never exceed the number of CPUs on the server. If you do notice the run queue exceeding the amount of CPUs, it's a good indication that your server has a CPU bottleneck. To get an idea of the RAM usage on your server, watch the page in (pi) and page out (po) columns of vmstat's output. By tracking common virtual memory operations such as page outs, you can infer the times that the Oracle database is performing a lot of work. Even though UNIX page ins must correlate with the vmstat's refresh rate to accurately predict RAM swapping, plotting page ins can tell you when the server is having spikes of RAM usage. Once captured, it's very easy to take the information about server performance directly from the Oracle tables and plot them in a trend graph. Rather than using an expensive statistical package such as SAS, you can use Microsoft Excel. Copy and paste the data from the tables into Excel. After that, you can use the Chart Wizard to create a line chart that will help you view server usage information and discover trends. # VMSTAT AIX: ------------- This is virtually equal to the usage of vmstat under solaris. vmstat can be used to give multiple statistics on the system. For CPU-specific work, try the following command: # vmstat -t 1 3 This will take 3 samples, 1 second apart, with timestamps (-t). You can, of course, change the parameters as you like. The output is shown below. 
kthr    memory             page              faults        cpu          time
----- ----------- ------------------------ ------------ ----------- --------
 r  b   avm   fre re  pi  po  fr  sr  cy  in   sy  cs us sy id wa hr mi se
 0  0 45483   221  0   0   0   0   1   0 224  326 362 24  7 69  0 15:10:22
 0  0 45483   220  0   0   0   0   0   0 159   83  53  1  1 98  0 15:10:23
 2  0 45483   220  0   0   0   0   0   0 145  115  46  0  9 90  1 15:10:24

In this output some of the things to watch for are:

"avm", which is Active Virtual Memory. Ideally, under normal conditions, the largest avm value
should in general be smaller than the amount of RAM. If avm is smaller than RAM, and still
excessive paging occurs, that could be due to RAM being filled with file pages.
avm x 4K = number of bytes

Columns r (run queue) and b (blocked) start going up, especially above 10. This usually is an
indication that you have too many processes competing for CPU.

If cs (context switches) goes very high compared to the number of processes, then you may need
to tune the system with vmtune.

In the cpu section, us (user time) indicates the time being spent in programs. Assuming Java is
at the top of the list in tprof, then you need to tune the Java application.

In the cpu section, if sys (system time) is higher than expected, and you still have id (idle)
time left, this may indicate lock contention. Check the tprof for lock related calls in the
kernel time. You may want to try multiple instances of the JVM. It may also be possible to find
deadlocks in a javacore file.

In the cpu section, if wa (I/O wait) is high, this may indicate a disk bottleneck, and you should
use iostat and other tools to look at the disk usage.

Non-zero values in the pi and po (page in/out) columns may indicate that you are paging and need
more memory. It may be possible that you have the stack size set too high for some of your JVM
instances. It could also mean that you have allocated a heap larger than the amount of memory on
the system.
Of course, you may also have other applications using memory, or that file pages may be taking up too much of the memory Other example: -------------- # vmstat 1 System configuration: lcpu=2 mem=3920MB kthr memory page faults cpu ----- ----------- ------------------------ ------------ ----------- r b avm fre re pi po fr sr cy in sy cs us sy id wa 0 0 229367 332745 0 0 0 0 0 0 3 198 69 0 0 99 0 0 0 229367 332745 0 0 0 0 0 0 3 33 66 0 0 99 0 0 0 229367 332745 0 0 0 0 0 0 2 33 68 0 0 99 0 0 0 229367 332745 0 0 0 0 0 0 80 306 100 0 1 97 1 0 0 229367 332745 0 0 0 0 0 0 1 20 68 0 0 99 0 0 0 229367 332745 0 0 0 0 0 0 2 36 64 0 0 99 0 0 0 229367 332745 0 0 0 0 0 0 2 33 66 0 0 99 0 0 0 229367 332745 0 0 0 0 0 0 2 21 66 0 0 99 0 0 0 229367 332745 0 0 0 0 0 0 1 237 64 0 0 99 0 0 0 229367 332745 0 0 0 0 0 0 2 19 66 0 0 99 0 0 0 229367 332745 0 0 0 0 0 0 6 37 76 0 0 99 0 The most important fields to look at here are: r -- The average number of runnable kernel threads over whatever sampling interval you have chosen. b -- The average number of kernel threads that are in the virtual memory waiting queue over your sampling interval. r should always be higher than b; if it is not, it usually means you have a CPU bottleneck. fre -- The size of your memory free list. Do not worry so much if the amount is really small. More importantly, determine if there is any paging going on if this amount is small. pi -- Pages paged in from paging space. po -- Pages paged out to paging space. CPU section: us sy id wa Let's look at the last section, which also comes up in most other CPU monitoring tools, albeit with different headings: us -- user time sy -- system time id -- idle time wa -- waiting on I/O # IOSTAT: --------- This command is useful for monitoring I/O activities. You can use the read and write rate to estimate the amount of time required for certain SQL operations (if they are the only activity on the system). 
This command is also useful for determining if something is suspended or just taking a long time.

Basic syntax is: iostat <option> <interval> <count>

option   - lets you specify the device for which information is needed, like disk, cpu, or
           terminal (-d, -c, -t, or -tdc). The -x option gives the extended statistics.
interval - is the time period in seconds between two samples. "iostat 4" will give data at
           4 second intervals.
count    - is the number of times the data is needed. "iostat 4 5" will give data at 4 second
           intervals, 5 times.

Example:

$ iostat -xtc 5 2
                     extended disk statistics               tty        cpu
disk    r/s  w/s Kr/s Kw/s wait actv svc_t %w %b     tin tout   us sy wt id
sd0     2.6  3.0 20.7 22.7  0.1  0.2  59.2  6 19       0   84    3 85 11  0
sd1     4.2  1.0 33.5  8.0  0.0  0.2  47.2  2 23
sd2     0.0  0.0  0.0  0.0  0.0  0.0   0.0  0  0
sd3    10.2  1.6 51.4 12.8  0.1  0.3  31.2  3 31

disk   name of the disk
r/s    reads per second
w/s    writes per second
Kr/s   kilobytes read per second
Kw/s   kilobytes written per second
wait   average number of transactions waiting for service (Q length)
actv   average number of transactions actively being serviced (removed from the queue but not
       yet completed)
%w     percent of time there are transactions waiting for service (queue non-empty)
%b     percent of time the disk is busy (transactions in progress)

The values to look for in the iostat output are:

Reads/writes per second (r/s, w/s)
Percentage busy (%b)
Service time (svc_t)

If a disk shows consistently high reads/writes, the percentage busy (%b) of the disk is greater
than 5 percent, and the average service time (svc_t) is greater than 30 milliseconds, then action
needs to be taken.

# netstat

This command lets you know the network traffic on each node, and the number of error packets
encountered. It is useful for isolating network problems.
Example: To find out all listening services, you can use the command # netstat -a -f inet 1.2.11 Some other utilities for Solaris: ======================================== # top For example: load averages: 0.66, 0.54, 0.56 11:14:48 187 processes: 185 sleeping, 2 on cpu CPU states: % idle, % user, % kernel, % iowait, % swap Memory: 4096M real, 1984M free, 1902M swap in use, 2038M swap free PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND 2795 oraclown 1 59 0 265M 226M sleep 0:13 4.38% oracle 2294 root 11 59 0 8616K 7672K sleep 10:54 3.94% bpbkar 13907 oraclown 11 59 0 271M 218M cpu2 4:02 2.23% oracle 14138 oraclown 12 59 0 270M 230M sleep 9:03 1.76% oracle 2797 oraclown 1 59 0 189M 151M sleep 0:01 0.96% oracle 2787 oraclown 11 59 0 191M 153M sleep 0:06 0.69% oracle 2799 oraclown 1 59 0 190M 151M sleep 0:02 0.45% oracle 2743 oraclown 11 59 0 191M 155M sleep 0:25 0.35% oracle 2011 oraclown 11 59 0 191M 149M sleep 2:50 0.27% oracle 2007 oraclown 11 59 0 191M 149M sleep 2:22 0.26% oracle 2009 oraclown 11 59 0 191M 149M sleep 1:54 0.20% oracle 2804 oraclown 1 51 0 1760K 1296K cpu2 0:00 0.19% top 2013 oraclown 11 59 0 191M 148M sleep 0:36 0.14% oracle 2035 oraclown 11 59 0 191M 149M sleep 2:44 0.13% oracle 114 root 10 59 0 5016K 4176K sleep 23:34 0.05% picld Process ID This column shows the process ID (pid) of each process. The process ID is a positive number, usually less than 65536. It is used for identification during the life of the process. Once a process has exited or been killed, the process ID can be reused. Username This column shows the name of the user who owns the process. The kernel stores this information as a uid, and top uses an appropriate table (/etc/passwd, NIS, or NIS+) to translate this uid in to a name. Threads This column displays the number of threads for the current process. This column is present only in the Solaris 2 port of top. 
For Solaris, this number is actually the number of lightweight processes (lwps) created by the
threads package to handle the threads. Depending on current resource utilization, there may not
be one lwp for every thread. Thus this number is actually less than or equal to the total number
of threads created by the process.

Nice
This column reflects the "nice" setting of each process. A process's nice is inherited from its
parent. Most user processes run at a nice of 0, indicating normal priority. Users have the option
of starting a process with a positive nice value to allow the system to reduce the priority given
to that process. This is normally done for long-running cpu-bound jobs to keep them from
interfering with interactive processes. The Unix command "nice" controls setting this value. Only
root can set a nice value lower than the current value. Nice values can be negative. On most
systems they range from -20 to 20. The nice value influences the priority value calculated by the
Unix scheduler.

Size
This column shows the total amount of memory allocated by each process. This is virtual memory
and is the sum total of the process's text area (program space), data area, and dynamically
allocated area (or "break"). When a process allocates additional memory with the system call
"brk", this value will increase. This is done indirectly by the C library function "malloc". The
number in this column does not reflect the amount of physical memory currently in use by the
process.

Resident Memory
This column reflects the amount of physical memory currently allocated to each process. This is
also known as the "resident set size" or RSS. A process can have a large amount of virtual memory
allocated (as indicated by the SIZE column) but still be using very little physical memory.

Process State
This column reflects the last observed state of each process. State names vary from system to
system.
These states are analogous to those that appear in the process states line: the second line of
the display. The more common state names are listed below.

cpu   - Assigned to a CPU and currently running
run   - Currently able to run
sleep - Awaiting an external event, such as input from a device
stop  - Stopped by a signal, as with control Z
swap  - Virtual address space swapped out to disk
zomb  - Exited, but parent has not called "wait" to receive the exit status

CPU Time
This column displays the accumulated CPU time for each process. This is the amount of time that
any cpu in the system has spent actually running this process. The standard format shows two
digits indicating minutes, a colon, then two digits indicating seconds. For example, the display
"15:32" indicates fifteen minutes and thirty-two seconds. When a time value is greater than or
equal to 1000 minutes, it is displayed as hours with the suffix H. For example, the display
"127.4H" indicates 127 hours plus four tenths of an hour (24 minutes). When the number of hours
exceeds 999.9, the "H" suffix is dropped so that the display continues to fit in the column.

CPU Percentage
This column shows the percentage of the cpu that each process is currently consuming. By default,
top will sort this column of the output. Some versions of Unix will track cpu percentages in the
kernel, as the figure is used in the calculation of a process's priority. On those versions, top
will use the figure as calculated by the kernel. Other versions of Unix do not perform this
calculation, and top must determine the percentage explicitly by monitoring the changes in cpu
time. On most multiprocessor machines, the number displayed in this column is a percentage of
the total available cpu capacity. Therefore, a single threaded process running on a four
processor system will never use more than 25% of the available cpu cycles.

Command
This column displays the name of the executable image that each process is running.
In most cases this is the base name of the file that was invoked with the most recent kernel
"exec" call. On most systems, this name is maintained separately from the zeroth argument. A
program that changes its zeroth argument will not affect the output of this column.

# modinfo

The modinfo command provides information about the modules currently loaded by the kernel.

The /etc/system file:
Available for the Solaris Operating Environment, the /etc/system file contains definitions for
kernel configuration limits such as the maximum number of users allowed on the system at a time,
the maximum number of processes per user, and the inter-process communication (IPC) limits on
size and number of resources. These limits are important because they affect DB2 performance on
a Solaris Operating Environment machine. See the Quick Beginnings information for further
details.

# more /etc/path_to_inst

To see the mapping between the kernel abbreviated instance names and the physical device names,
view the /etc/path_to_inst file.

# uptime

uptime - show how long the system has been up

/export/home/oraclown>uptime
  11:32am up 4:19, 1 user, load average: 0.40, 1.17, 0.90

1.2.12 proc tools for Solaris:
==============================

The proc tools are called that way because they retrieve information from the /proc virtual
filesystem. They are:

/usr/proc/bin/pflags [-r] pid...
/usr/proc/bin/pcred pid...
/usr/proc/bin/pmap [-rxlF] pid...
/usr/proc/bin/pldd [-F] pid...
/usr/proc/bin/psig pid...
/usr/proc/bin/pstack [-F] pid...
/usr/proc/bin/pfiles [-F] pid...
/usr/proc/bin/pwdx [-F] pid...
/usr/proc/bin/pstop pid...
/usr/proc/bin/prun pid...
/usr/proc/bin/pwait [-v] pid...
/usr/proc/bin/ptree [-a] [[pid| user]...]
/usr/proc/bin/ptime command [arg...]
/usr/proc/bin/pattr [-x ] [pid...]
/usr/proc/bin/pclear [pid...]
/usr/proc/bin/plabel [pid...]
/usr/proc/bin/ppriv [-a] [pid...]
-- pfiles: reports all the files which are opened by a given pid
-- pldd: lists all the dynamic libraries linked to the process
-- pwdx: gives the directory from which the process is running
-- ptree: The ptree utility prints the process trees containing the specified pids or users,
   with child processes indented from their respective parent processes. An argument of all
   digits is taken to be a process-ID, otherwise it is assumed to be a user login name. The
   default is all processes.

Use it like

# ptree

Or use it with params, which enables you to produce different listings.

The following example prints the process tree (including children of process 0) for processes
which match the command name ssh:

$ ptree -a `pgrep ssh`
        1     /sbin/init
          100909 /usr/lib/ssh/sshd
            569150 /usr/lib/ssh/sshd
              569157 /usr/lib/ssh/sshd
                569159 -ksh
                  569171 bash
                    569173 /bin/ksh
                      569193 bash

----------------------------------------------------------------------
Remark: many Linux distros provide a similar command, "pstree". As in

ubuntu$ pstree -pl
init(1)---NetworkManager(5427)
        +-NetworkManagerD(5441)
        +-acpid(5210)
        +-apache2(6966)---apache2(2890)
        |               +-apache2(2893)
        |               +-apache2(7163)
        |               +-apache2(7165)
        |               +-apache2(7166)
        |               +-apache2(7167)
        |               +-apache2(7168)
        +-atd(6369)
        +-avahi-daemon(5658)---avahi-daemon(5659)
        +-bonobo-activati(7816)---{bonobo-activati}(7817)
etc..
..
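The pgrep/ptree combination above generalizes to any command name. A hedged sketch (the sshd
default is illustrative; on Linux you would substitute "pstree -p" for ptree):

```shell
#!/bin/sh
# Print the process tree for every pid whose command name matches $1.
# Usage: ./trees.sh sshd
name=${1:-sshd}
for pid in $(pgrep "$name"); do
    echo "=== tree for pid $pid ==="
    ptree "$pid"
done
```

pgrep prints one matching pid per line, so the for loop runs ptree once per match; with no
matches the loop body is simply skipped.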
------------------------------------------------------------------------ Back to Solaris again: Suppose you did a pfiles on an Apache process: # pfiles 13789 13789: /apps11i/erpdev/10GAS/Apache/Apache/bin/httpd -d /apps11i/erpdev/10G Current rlimit: 1024 file descriptors 0: S_IFIFO mode:0000 dev:350,0 ino:114723 uid:65060 gid:54032 size:301 O_RDWR 1: S_IFREG mode:0640 dev:307,28001 ino:612208 uid:65060 gid:54032 size:386 O_WRONLY|O_APPEND|O_CREAT /apps11i/erpdev/10GAS/opmn/logs/HTTP_Server~1 2: S_IFIFO mode:0000 dev:350,0 ino:143956 uid:65060 gid:54032 size:0 O_RDWR 3: S_IFREG mode:0600 dev:307,28001 ino:606387 uid:65060 gid:54032 size:1056768 O_RDWR|O_CREAT /apps11i/erpdev/10GAS/Apache/Apache/logs/mm.19389.mem 4: S_IFREG mode:0600 dev:307,28001 ino:606383 uid:65060 gid:54032 size:0 O_RDWR|O_CREAT 5: S_IFREG mode:0600 dev:307,28001 ino:621827 uid:65060 gid:54032 size:1056768 O_RDWR|O_CREAT 6: S_IFDOOR mode:0444 dev:351,0 ino:58 uid:0 gid:0 size:0 O_RDONLY|O_LARGEFILE FD_CLOEXEC door to nscd[421] /var/run/name_service_door 7: S_IFIFO mode:0000 dev:350,0 ino:143956 uid:65060 gid:54032 size:0 O_RDWR 8: S_IFCHR mode:0666 dev:342,0 ino:47185924 uid:0 gid:3 rdev:90,0 O_RDONLY /devices/pseudo/kstat@0:kstat etc.. .. .. 
O_RDWR|O_CREAT /apps11i/erpdev/10GAS/Apache/Apache/logs/dms_metrics.19389.shm.sem 21: S_IFREG mode:0600 dev:307,28001 ino:603445 uid:65060 gid:54032 size:17408 O_RDONLY FD_CLOEXEC /apps11i/erpdev/10GAS/rdbms/mesg/ocius.msb 23: S_IFSOCK mode:0666 dev:348,0 ino:60339 uid:0 gid:0 size:0 O_RDWR SOCK_STREAM SO_SNDBUF(49152),SO_RCVBUF(49152),IP_NEXTHOP(0.0.192.0) sockname: AF_INET 3.56.189.4 port: 45395 peername: AF_INET 3.56.189.4 port: 12501 256: S_IFREG mode:0444 dev:85,0 ino:234504 uid:0 gid:3 size:1616 O_RDONLY|O_LARGEFILE /etc/inet/hosts Suppose you tried pldd on the same process gave this result: # pldd 13789 13789: /apps11i/erp dev/10GAS/Apache/Apache/bin/httpd -d /apps11i/erpdev/10G /apps11i/erpdev/10GAS/lib32/libdms2.so /lib/libpthread.so.1 /lib/libsocket.so.1 /lib/libnsl.so.1 /lib/libdl.so.1 /lib/libc.so.1 /platform/sun4u-us3/lib/libc_psr.so.1 /lib/libmd5.so.1 /platform/sun4u/lib/libmd5_psr.so.1 /lib/libscf.so.1 /lib/libdoor.so.1 /lib/libuutil.so.1 /lib/libgen.so.1 /lib/libmp.so.2 /lib/libm.so.2 /lib/libresolv.so.2 /apps11i/erpdev/10GAS/Apache/Apache/libexec/mod_onsint.so /lib/librt.so.1 /apps11i/erpdev/10GAS/lib32/libons.so /lib/libkstat.so.1 /lib/libaio.so.1 /apps11i/erpdev/10GAS/Apache/Apache/libexec/mod_mmap_static.so /apps11i/erpdev/10GAS/Apache/Apache/libexec/mod_vhost_alias.so /apps11i/erpdev/10GAS/Apache/Apache/libexec/mod_env.so .. .. 
etc /usr/lib/libsched.so.1 /apps11i/erpdev/10GAS/lib32/libclntsh.so.10.1 /apps11i/erpdev/10GAS/lib32/libnnz10.so /apps11i/erpdev/10GAS/Apache/Apache/libexec/mod_wchandshake.so /apps11i/erpdev/10GAS/Apache/Apache/libexec/mod_oc4j.so /apps11i/erpdev/10GAS/Apache/Apache/libexec/mod_dms.so /apps11i/erpdev/10GAS/Apache/Apache/libexec/mod_rewrite.so /apps11i/erpdev/10GAS/Apache/oradav/lib/mod_oradav.so /apps11i/erpdev/10GAS/Apache/modplsql/bin/modplsql.so # pmap -x $$ 492328: -ksh Address Kbytes RSS Anon Locked Mode Mapped File 00010000 192 192 - - r-x-- ksh 00040000 8 8 8 - rwx-- ksh 00042000 40 40 8 - rwx-- [ heap ] FF180000 680 680 - - r-x-- libc.so.1 FF23A000 24 24 - - rwx-- libc.so.1 FF240000 8 8 8 - rwx-- libc.so.1 FF280000 576 576 - - r-x-- libnsl.so.1 FF310000 40 40 - - rwx-- libnsl.so.1 FF31A000 24 16 - - rwx-- libnsl.so.1 FF350000 16 16 - - r-x-- libmp.so.2 FF364000 8 8 - - rwx-- libmp.so.2 FF380000 40 40 - - r-x-- libsocket.so.1 FF39A000 8 8 - - rwx-- libsocket.so.1 FF3A0000 8 8 - - r-x-- libdl.so.1 FF3B0000 8 8 8 - rwx-- [ anon ] FF3C0000 152 152 - - r-x-- ld.so.1 FF3F6000 8 8 8 - rwx-- ld.so.1 FFBFC000 16 16 8 - rw--- [ stack ] -------- ------- ------- ------- ------- total Kb 1856 1848 48 - 1.2.13 Wellknown tools for AIX: =============================== 1. commands: ------------ CPU Memory Subsystem I/O Subsystem Network Subsystem --------------------------------------------------------------------------------- vmstat vmstat iostat netstat iostat lsps vmstat ifconfig ps svmon lsps tcpdump sar filemon filemon tprof ipcs lvmstat nmon and topas can be used to monitor those subsystems in general. 2. topas: --------- topas is a useful graphical interface that will give you immediate results of what is going on in the system. 
When you run it without any command-line arguments, the screen looks like this:

Topas Monitor for host:    aix4prt             EVENTS/QUEUES    FILE/TTY
Mon Apr 16 16:16:50 2001   Interval:  2        Cswitch    5984  Readch     4864
                                               Syscall   15776  Writech   34280
Kernel   63.1  |##################        |    Reads         8  Rawin         0
User     36.8  |##########                |    Writes     2469  Ttyout        0
Wait      0.0  |                          |    Forks         0  Igets         0
Idle      0.0  |                          |    Execs         0  Namei         4
                                               Runqueue   11.5  Dirblk        0
Network  KBPS   I-Pack  O-Pack  KB-In  KB-Out  Waitqueue   0.0
lo0     213.9   2154.2  2153.7  107.0   106.9
tr0      34.7     16.9    34.4    0.9    33.8  PAGING           MEMORY
                                               Faults     3862  Real,MB    1023
Disk    Busy%     KBPS     TPS KB-Read KB-Writ Steals     1580  % Comp     27.0
hdisk0    0.0      0.0     0.0     0.0     0.0 PgspIn        0  % Noncomp  73.9
                                               PgspOut       0  % Client    0.5
Name            PID  CPU%  PgSp Owner          PageIn        0
java          16684  83.6  35.1 root           PageOut       0  PAGING SPACE
java          12192  12.7  86.2 root           Sios          0  Size,MB     512
lrud           1032   2.7   0.0 root                            % Used      1.2
aixterm       19502   0.5   0.7 root           NFS (calls/sec)  % Free     98.7
topas          6908   0.5   0.8 root           ServerV2      0
ksh           18148   0.0   0.7 root           ClientV2      0     Press:
gil            1806   0.0   0.0 root           ServerV3      0     "h" for help

The information on the bottom left side shows the most active processes; here, java is consuming
83.6% of CPU. The middle right area shows the total physical memory (1 GB in this case) and
paging space (512 MB), as well as the amount being used. So you get an excellent overview of what
the system is doing in a single screen, and then you can select the areas to concentrate on based
on the information being shown here.

Note: about waits:
------------------

Don't get caught up in this whole wait I/O thing. A single cpu system with 1 I/O outstanding and
no other runnable threads (i.e. idle) will have 100% wait I/O. There was a big discussion a
couple of years ago on removing the kernel tick, as it has confused many, many techs. So, if you
have only one or a few cpus, then you are going to have high wait I/O figures; it does not
necessarily mean your disk subsystem is slow.

3. trace:
---------

trace captures a sequential flow of time-stamped system events.
The trace is a valuable tool for observing system and application execution. While many of the other tools provide high level statistics such as CPU and I/O utilization, the trace facility helps expand the information as to where the events happened, which process is responsible, when the events took place, and how they are affecting the system. Two post processing tools that can extract information from the trace are utld (in AIX 4) and curt (in AIX 5). These provide statistics on CPU utilization and process/thread activity. The third post processing tool is splat which stands for Simple Performance Lock Analysis Tool. This tool is used to analyze lock activity in the AIX kernel and kernel extension for simple locks. 4. nmon: -------- nmon is a free software tool that gives much of the same information as topas, but saves the information to a file in Lotus 123 and Excel format. The download site is http://www.ibm.com/developerworks/eserver/articles/analyze_aix/. The information that is collected included CPU, disk, network, adapter statistics, kernel counters, memory and the "top" process information. 5. tprof: --------- tprof is one of the AIX legacy tools that provides a detailed profile of CPU usage for every AIX process ID and name. It has been completely rewritten for AIX 5.2, and the example below uses the AIX 5.1 syntax. You should refer to AIX 5.2 Performance Tools update: Part 3 for the new syntax. The simplest way to invoke this command is to use: # tprof -kse -x "sleep 10" # tprof -ske -x "sleep 30" At the end of ten seconds, or 30 seconds, a new file __prof.all, or sleep.prof, is generated that contains information about what commands are using CPU on the system. Searching for FREQ, the information looks something like the example below: Process FREQ Total Kernel User Shared Other ======= === ===== ====== ==== ====== ===== oracle 244 10635 3515 6897 223 0 java 247 3970 617 0 2062 1291 wait 16 1515 1515 0 0 0 ... 
=======     ===  =====  ======  ====  ======  =====
Total      1060  19577    7947  7252    3087   1291

This example shows that over half the CPU time is associated with the oracle application, and that Java is using about 3970/19577, or 1/5, of the CPU. The wait usually means idle time, but can also include the I/O wait portion of the CPU usage.

svmon:
------
The svmon command captures a snapshot of the current state of memory. Use it with the -G switch to get global statistics for the whole system.
svmon is the most useful tool at your disposal when monitoring a Java process, especially its native heap. The article "When segments collide" gives examples of how to use svmon -P <pid> -m to monitor the native heap of a Java process on AIX. But there is another variation, svmon -P <pid> -m -r, that is very effective in identifying native heap fragmentation. The -r switch prints the address ranges in use, so it gives a more accurate view of how much of each segment is in use. As an example, look at the partially edited output below:

Pid    Command   Inuse   Pin  Pgsp  Virtual  64-bit  Mthrd  LPage
10556  java     681613  2316  2461   501080       N      Y      N

Vsid   Esid  Type  Description          LPage  Inuse  Pin  Pgsp  Virtual
22ac4  9     mmap  mapped to sid b1475      -      0    0     -        -
21047  8     mmap  mapped to sid 30fe5      -      0    0     -        -
126a2  a     mmap  mapped to sid 91072      -      0    0     -        -
7908c  7     mmap  mapped to sid 6bced      -      0    0     -        -
b2ad6  b     mmap  mapped to sid b1035      -      0    0     -        -
b1475  -     work                           -  65536    0   282    65536
30fe5  -     work                           -  65536    0   285    65536
91072  -     work                           -  65536    0    54    65536
6bced  -     work                           -  65536    0   261    65536
b1035  -     work                           -  45054    0     0    45054
                   Addr Range: 0..45055
e0f9f  5     work  shmat/mmap               -  48284    0     3    48284
19100  3     work  shmat/mmap               -  46997    0   463    47210
c965a  4     work  shmat/mmap               -  46835    0   281    46953
7910c  6     work  shmat/mmap               -  37070    0     0    37070
                   Addr Range: 0..50453
e801d  d     work  shared library text      -   9172    0     0     9220
                   Addr Range: 0..30861
a0fb7  f     work  shared library data      -    105    0     1      106
                   Addr Range: 0..2521
21127  2     work  process private          -     50    2     1       51
                   Addr Range: 65300..65535
a8535  1     pers  code,/dev/q109waslv:81938 -    11    0     -        -
                   Addr Range: 0..11

Other example:

# svmon -G -i 2 5        # sample five times at two-second intervals

        memory                     in use              pin           pg space
  size   inuse  free   pin    work  pers  clnt    work  pers  clnt   size  inuse
 16384   16250   134  2006   10675  2939  2636    2006     0     0  40960  12674
 16384   16250   134  2006   10675  2939  2636    2006     0     0  40960  12674
 16384   16250   134  2006   10675  2939  2636    2006     0     0  40960  12674
 16384   16250   134  2006   10675  2939  2636    2006     0     0  40960  12674
 16384   16250   134  2006   10675  2939  2636    2006     0     0  40960  12674

In this example, there are 16384 pages of total memory. Multiply this number by 4096 to see the total real memory size. In this case the total memory is 64 MB.

filemon:
--------
filemon can be used to identify the files that are being used most actively. This tool gives a very comprehensive view of file access, and can be useful for drilling down once vmstat/iostat confirm disk to be a bottleneck.

Example:

# filemon -o /tmp/filemon.log; sleep 60; trcstop

The generated log file is quite large. Some sections that may be useful are:

Most Active Files
------------------------------------------------------------------------
 #MBs  #opns  #rds  #wrs  file         volume:inode
------------------------------------------------------------------------
 25.7     83  6589     0  unix         /dev/hd2:147514
 16.3      1  4175     0  vxe102       /dev/mailv1:581
 16.3      1     0  4173  .vxe102.pop  /dev/poboxv:62
 15.8      1     1  4044  tst1         /dev/mailt1:904
  8.3   2117  2327     0  passwd       /dev/hd4:8205
  3.2    182   810     1  services     /dev/hd4:8652
...

------------------------------------------------------------------------
Detailed File Stats
------------------------------------------------------------------------
FILE: /var/spool/mail/v/vxe102  volume: /dev/mailv1 (/var/spool2/mail/v)  inode: 581
opens:                  1
total bytes xfrd:       17100800
reads:                  4175    (0 errs)
  read sizes (bytes):   avg 4096.0  min 4096   max 4096    sdev 0.0
  read times (msec):    avg 0.543   min 0.011  max 78.060  sdev 2.753
...
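The "Most Active Files" table lends itself to quick filtering with standard text tools. Below is a minimal sketch; the sample file /tmp/mostactive.sample just mimics a few columns of the table above, and in a real session you would extract that section from /tmp/filemon.log instead:

```shell
# Fake a few "Most Active Files" rows (#MBs #opns #rds #wrs file volume:inode);
# normally these would come out of the filemon log.
cat <<'EOF' > /tmp/mostactive.sample
25.7   83  6589    0  unix        /dev/hd2:147514
16.3    1  4175    0  vxe102      /dev/mailv1:581
 8.3 2117  2327    0  passwd      /dev/hd4:8205
EOF

# Show files that moved more than 10 MB, largest first.
sort -rn /tmp/mostactive.sample | awk '$1 > 10 {print $5, $1 " MB"}'
```

Since #MBs is the first column, sort -rn orders the rows numerically, and awk then keeps only the heavy hitters.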
curt:
-----
curt Command

Purpose
The CPU Utilization Reporting Tool (curt) command converts an AIX trace file into a number of statistics related to CPU utilization and either process, thread or pthread activity. These statistics ease the tracking of specific application activity. curt works with both uniprocessor and multiprocessor AIX Version 4 and AIX Version 5 traces.

Syntax
curt -i inputfile [-o outputfile] [-n gennamesfile] [-m trcnmfile] [-a pidnamefile] [-f timestamp] [-l timestamp] [-ehpstP]

Description
The curt command takes an AIX trace file as input and produces a number of statistics related to processor (CPU) utilization and process/thread/pthread activity. It will work with both uniprocessor and multiprocessor AIX traces if the processor clocks are properly synchronized.

genkld:
-------
genkld Command

Purpose
The genkld command extracts the list of shared objects currently loaded onto the system and displays the address, size, and path name for each object on the list.

Syntax
genkld

Description
For shared objects loaded onto the system, the kernel maintains a linked list consisting of data structures called loader entries. A loader entry contains the name of the object, its starting address, and its size. This information is gathered and reported by the genkld command.

Implementation Specifics
This command is valid only on the POWER-based platform.

Examples
To obtain a list of loaded shared objects, enter:

# genkld
..
d0791c00   18ab27  /usr/lib/librtl.a[shr.o]
d0194500     7e07  /usr/lib/libbsd.a[shr.o]
d019d0f8     3d39  /usr/lib/libbind.a[shr.o]
d0237100    1eac0  /usr/lib/libwlm.a[shr.o]
d01d5100    1fff9  /usr/lib/libC.a[shr.o]
d02109e0    262b2  /usr/lib/libC.a[shrcore.o]
d01f6c60    190dc  /usr/lib/libC.a[ansicore_32.o]
d01b0000    24cfd  /usr/lib/boot/bin/libcfg_chrp
d010a000    367ad  /usr/lib/libpthreads.a[shr_xpg5.o]
d0142000     3cee  /usr/lib/libpthreads.a[shr_comm.o]
d017f100    1172a  /usr/lib/libcfg.a[shr.o]
d016c100    128b2  /usr/lib/libodm.a[shr.o]
d014c100     b12d  /usr/lib/libi18n.a[shr.o]
d0158100    13b41  /usr/lib/libiconv.a[shr4.o]
d01410f8      846  /usr/lib/libcrypt.a[shr.o]
..
etc..

1.2.14 Not so well known tools for AIX: the proc tools:
=======================================================

-- proctree
Displays the process tree containing the specified process IDs or users.
To display the ancestors and all the children of process 21166, enter:

# proctree 21166
11238    /usr/sbin/srcmstr
   21166    /usr/sbin/rsct/bin/IBM.AuditRMd

To display the ancestors and children of process 21166, including children of process 0, enter:

# proctree -a 21166
1        /etc/init
   11238    /usr/sbin/srcmstr
      21166    /usr/sbin/rsct/bin/IBM.AuditRMd

-- procstack
Displays the hexadecimal addresses and symbolic names for each of the stack frames of the current thread in processes.
To display the current stack of process 15052, enter:

# procstack 15052
15052 : /usr/sbin/snmpd
d025ab80  select   (?, ?, ?, ?, ?) + 90
100015f4  main     (?, ?, ?) + 1814
10000128  __start  () + 8c

Currently, procstack displays garbage or wrong information for the top stack frame, and possibly for the second top stack frame. Sometimes it will erroneously display "No frames found on the stack," and sometimes it will display:

deadbeef ???????? (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ...)

The fix for this problem had not been released at the time of writing of this article. When the fix becomes available, you need to download APAR IY48543 for AIX 5.2. For AIX 5.3 it should all work OK.
-- procmap
Displays a process address map.
To display the address space of process 13204, enter:

# procmap 13204
13204 : /usr/sbin/biod 6
10000000        3K  read/exec     biod
20000910        0K  read/write    biod
d0083100       79K  read/exec     /usr/lib/libiconv.a
20013bf0       41K  read/write    /usr/lib/libiconv.a
d007a100       34K  read/exec     /usr/lib/libi18n.a
20011378        4K  read/write    /usr/lib/libi18n.a
d0074000       11K  read/exec     /usr/lib/nls/loc/en_US
d0077130        8K  read/write    /usr/lib/nls/loc/en_US
d00730f8        2K  read/exec     /usr/lib/libcrypt.a
f03c7508        0K  read/write    /usr/lib/libcrypt.a
d01d4e20     1997K  read/exec     /usr/lib/libc.a
f0337e90      570K  read/write    /usr/lib/libc.a

-- procldd
Displays a list of libraries loaded by a process.
To display the list of dynamic libraries loaded by process 11928, enter:

# procldd 11928
11928 : -sh
/usr/lib/nls/loc/en_US
/usr/lib/libcrypt.a
/usr/lib/libc.a

-- procflags
Displays a process's tracing flags, and the pending and held signals.
To display the tracing flags of process 28138, enter:

# procflags 28138
28138 : /usr/sbin/rsct/bin/IBM.HostRMd
data model = _ILP32  flags = PR_FORK
/64763:  flags = PR_ASLEEP | PR_NOREGS
/66315:  flags = PR_ASLEEP | PR_NOREGS
/60641:  flags = PR_ASLEEP | PR_NOREGS
/66827:  flags = PR_ASLEEP | PR_NOREGS
/7515:   flags = PR_ASLEEP | PR_NOREGS
/70439:  flags = PR_ASLEEP | PR_NOREGS
/66061:  flags = PR_ASLEEP | PR_NOREGS
/69149:  flags = PR_ASLEEP | PR_NOREGS

-- procsig
Lists the signal actions for a process.
To list all the signal actions defined for process 30552, enter:

# procsig 30552
30552 : -ksh
HUP     caught
INT     caught
QUIT    caught
ILL     caught
TRAP    caught
ABRT    caught
EMT     caught
FPE     caught
KILL    default RESTART
BUS     caught

-- proccred
Prints a process's credentials.
To display the credentials of process 25632, enter:

# proccred 25632
25632: e/r/suid=0  e/r/sgid=0

-- procfiles
Prints a list of open file descriptors.
To display status and control information on the file descriptors opened by process 20138, enter:

# procfiles -n 20138
20138 : /usr/sbin/rsct/bin/IBM.CSMAgentRMd
Current rlimit: 2147483647 file descriptors
0: S_IFCHR mode:00   dev:10,4 ino:4178 uid:0 gid:0 rdev:2,2
   O_RDWR  name:/dev/null
2: S_IFREG mode:0311 dev:10,6 ino:250  uid:0 gid:0 rdev:0,0
   O_RDWR  size:0  name:/var/ct/IBM.CSMAgentRM.stderr
4: S_IFREG mode:0200 dev:10,6 ino:255  uid:0 gid:0 rdev:0,0

-- procwdx
Prints the current working directory of a process.
To display the current working directory of process 11928, enter:

# procwdx 11928
11928 : /home/guest

-- procstop
Stops a process.
To stop process 7500 on the PR_REQUESTED event, enter:

# procstop 7500

-- procrun
Restarts a process.
To restart process 30192 that was stopped on the PR_REQUESTED event, enter:

# procrun 30192

-- procwait
Waits for all of the specified processes to terminate.
To wait for process 12942 to exit and display the status, enter:

# procwait -v 12942
12942 : terminated, exit status 0

1.2.15 Other monitoring:
========================

Nagios: open source monitoring for most UNIX systems:
-----------------------------------------------------
Nagios is an open source host, service and network monitoring program.
Latest versions: 2.5 (stable)

Overview
Nagios is a host and service monitor designed to inform you of network problems before your clients, end-users or managers do. It has been designed to run under the Linux operating system, but works fine under most *NIX variants as well. The monitoring daemon runs intermittent checks on hosts and services you specify using external "plugins" which return status information to Nagios. When problems are encountered, the daemon can send notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser.
System Requirements
The only requirement for running Nagios is a machine running Linux (or a UNIX variant) and a C compiler. You will probably also want to have TCP/IP configured, as most service checks will be performed over the network.
You are not required to use the CGIs included with Nagios. However, if you do decide to use them, you will need to have the following software installed...
- A web server (preferably Apache)
- Thomas Boutell's gd library version 1.6.3 or higher (required by the statusmap and trends CGIs)

rstat: Monitoring Machine Utilization with rstat:
-------------------------------------------------
rstat stands for Remote System Statistics service.
Ports exist for most unixes, like Linux, Solaris, AIX etc..

-- rstat on Linux, Solaris:
rstat is an RPC client program to get and print statistics from any machine running the rpc.rstatd daemon, its server-side counterpart. The rpc.rstatd daemon has been used for many years by tools such as Sun's perfmeter and the rup command. The rstat program is simply a new client for an old daemon. The fact that the rpc.rstatd daemon is already installed and running on most Solaris and Linux machines is a huge advantage over other tools that require the installation of custom agents.
The rstat client compiles and runs on Solaris and Linux as well, and can get statistics from any machine running a current rpc.rstatd daemon, such as Solaris, Linux, AIX, and OpenBSD. The rpc.rstatd daemon is started from /etc/inetd.conf on Solaris. It is similar to vmstat, but has some advantages over vmstat:
- You can get statistics without logging in to the remote machine, including over the Internet.
- It includes a timestamp.
- The output can be plotted directly by gnuplot.
The fact that it runs remotely means that you can use a single central machine to monitor the performance of many remote machines. It also has a disadvantage in that it does not give the useful scan rate measurement of memory shortage, the sr column in vmstat.
rstat will not work across most firewalls because it relies on port 111, the RPC port, which is usually blocked by firewalls.
To use rstat, simply give it the name or IP address of the machine you wish to monitor. Remember that rpc.rstatd must be running on that machine. The rup command is extremely useful here because, with no arguments, it simply prints out a list of all machines on the local network that are running the rstatd daemon. If a machine is not listed, you may have to start rstatd manually.
To start rpc.rstatd under Red Hat Linux, run, as root:

# /etc/rc.d/init.d/rstatd start

On Solaris, first try running the rstat client, because inetd is often already configured to automatically start rpc.rstatd on request. If the client fails with the error "RPC: Program not registered," make sure you have this line in your /etc/inet/inetd.conf and kill -HUP your inetd process to get it to re-read inetd.conf, as follows:

rstatd/2-4 tli rpc/datagram_v wait root /usr/lib/netsvc/rstat/rpc.rstatd rpc.rstatd

Then you can monitor that machine like this:

% rstat enkidu
2001 07 10 10 36 08   0  0  0 100   0  27  54   1   0   0  12 0.1

This command will give you a one-second average and then exit. If you want to continuously monitor, give an interval in seconds on the command line. Here's an example of one line of output every two seconds:

% rstat enkidu 2
2001 07 10 10 36 28   0  0  1  98   0   0   7   2   0   0  61 0.0
2001 07 10 10 36 30   0  0  0 100   0   0   0   2   0   0  15 0.0
2001 07 10 10 36 32   0  0  0 100   0   0   0   2   0   0  15 0.0
2001 07 10 10 36 34   0  0  0 100   0   5  10   2   0   0  19 0.0
2001 07 10 10 36 36   0  0  0 100   0   0  46   2   0   0 108 0.0
^C

To get a usage message, the output format, the version number, and where to go for updates, just type rstat with no parameters:

% rstat
usage: rstat machine [interval]
output: yyyy mm dd hh mm ss usr wio sys idl pgin pgout intr ipkts opkts coll cs load
docs and src at http://patrick.net/software/rstat/rstat.html

Notice that the column headings line up with the output data.
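Because the rstat columns are fixed, the output is easy to post-process with standard tools. Here is a minimal awk sketch that averages the idl column (field 10); the sample file below simply reuses lines from the output shown above, where in practice you would pipe rstat's output in directly:

```shell
# Sample rstat output lines (yyyy mm dd hh mm ss usr wio sys idl ...);
# normally these would come straight from "rstat machine interval".
cat <<'EOF' > /tmp/rstat.sample
2001 07 10 10 36 28   0  0  1  98   0   0   7   2   0   0  61 0.0
2001 07 10 10 36 30   0  0  0 100   0   0   0   2   0   0  15 0.0
2001 07 10 10 36 34   0  0  0 100   0   5  10   2   0   0  19 0.0
EOF

# Field 10 is idl; average it over all sampled lines.
awk '{ idle += $10; n++ } END { printf "avg idle: %.1f%%\n", idle/n }' /tmp/rstat.sample
# -> avg idle: 99.3%
```

The same pattern works for usr (field 7), wio (field 8) or any other column.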
-- AIX:
In order to get rstat working on AIX, you may need to configure rstatd.
As root:

1. Edit /etc/inetd.conf
   Uncomment or add an entry for rstatd, e.g.:
   rstatd sunrpc_udp udp wait root /usr/sbin/rpc.rstatd rstatd 100001 1-3
2. Edit /etc/services
   Uncomment or add an entry for rstatd, e.g.:
   rstatd 100001/udp
3. Refresh services:
   refresh -s inetd
4. Start rstatd:
   /usr/sbin/rpc.rstatd

1.2.16 UNIX ERROR CODES:
========================

It's always handy to have a list of error codes from the errno.h header file. It should be reasonably similar across the UNIX versions.
Note that this is only a very small list of errors and codes, associated solely with the interaction of a process with the system. The errors you might see at boot time of a system, or what an error-logging daemon might write in a logfile, are a very different story.

From the errno.h file:

>>> Errcodes Linux (generic):

#define EPERM            1      /* Operation not permitted */
#define ENOENT           2      /* No such file or directory */
#define ESRCH            3      /* No such process */
#define EINTR            4      /* Interrupted system call */
#define EIO              5      /* I/O error */
#define ENXIO            6      /* No such device or address */
#define E2BIG            7      /* Arg list too long */
#define ENOEXEC          8      /* Exec format error */
#define EBADF            9      /* Bad file number */
#define ECHILD          10      /* No child processes */
#define EAGAIN          11      /* Try again */
#define ENOMEM          12      /* Out of memory */
#define EACCES          13      /* Permission denied */
#define EFAULT          14      /* Bad address */
#define ENOTBLK         15      /* Block device required */
#define EBUSY           16      /* Device or resource busy */
#define EEXIST          17      /* File exists */
#define EXDEV           18      /* Cross-device link */
#define ENODEV          19      /* No such device */
#define ENOTDIR         20      /* Not a directory */
#define EISDIR          21      /* Is a directory */
#define EINVAL          22      /* Invalid argument */
#define ENFILE          23      /* File table overflow */
#define EMFILE          24      /* Too many open files */
#define ENOTTY          25      /* Not a typewriter */
#define ETXTBSY         26      /* Text file busy */
#define EFBIG           27      /* File too large */
#define ENOSPC 28 /* No space left on device */ #define ESPIPE 29 /* Illegal seek */ #define EROFS 30 /* Read-only file system */ #define EMLINK 31 /* Too many links */ #define EPIPE 32 /* Broken pipe */ #define EDOM 33 /* Math argument out of domain of func */ #define ERANGE 34 /* Math result not representable */ #define EDEADLK 35 /* Resource deadlock would occur */ #define ENAMETOOLONG 36 /* File name too long */ #define ENOLCK 37 /* No record locks available */ #define ENOSYS 38 /* Function not implemented */ #define ENOTEMPTY 39 /* Directory not empty */ #define ELOOP 40 /* Too many symbolic links encountered */ #define EWOULDBLOCK EAGAIN /* Operation would block */ #define ENOMSG 42 /* No message of desired type */ #define EIDRM 43 /* Identifier removed */ #define ECHRNG 44 /* Channel number out of range */ #define EL2NSYNC 45 /* Level 2 not synchronized */ #define EL3HLT 46 /* Level 3 halted */ #define EL3RST 47 /* Level 3 reset */ #define ELNRNG 48 /* Link number out of range */ #define EUNATCH 49 /* Protocol driver not attached */ #define ENOCSI 50 /* No CSI structure available */ #define EL2HLT 51 /* Level 2 halted */ #define EBADE 52 /* Invalid exchange */ #define EBADR 53 /* Invalid request descriptor */ #define EXFULL 54 /* Exchange full */ #define ENOANO 55 /* No anode */ #define EBADRQC 56 /* Invalid request code */ #define EBADSLT 57 /* Invalid slot */ #define EDEADLOCK EDEADLK #define EBFONT 59 /* Bad font file format */ #define ENOSTR 60 /* Device not a stream */ #define ENODATA 61 /* No data available */ #define ETIME 62 /* Timer expired */ #define ENOSR 63 /* Out of streams resources */ #define ENONET 64 /* Machine is not on the network */ #define ENOPKG 65 /* Package not installed */ #define EREMOTE 66 /* Object is remote */ #define ENOLINK 67 /* Link has been severed */ #define EADV 68 /* Advertise error */ #define ESRMNT 69 /* Srmount error */ #define ECOMM 70 /* Communication error on send */ #define EPROTO 71 /* Protocol error */ #define 
EMULTIHOP 72 /* Multihop attempted */ #define EDOTDOT 73 /* RFS specific error */ #define EBADMSG 74 /* Not a data message */ #define EOVERFLOW 75 /* Value too large for defined data type */ #define ENOTUNIQ 76 /* Name not unique on network */ #define EBADFD 77 /* File descriptor in bad state */ #define EREMCHG 78 /* Remote address changed */ #define ELIBACC 79 /* Can not access a needed shared library */ #define ELIBBAD 80 /* Accessing a corrupted shared library */ #define ELIBSCN 81 /* .lib section in a.out corrupted */ #define ELIBMAX 82 /* Attempting to link in too many shared libraries */ #define ELIBEXEC 83 /* Cannot exec a shared library directly */ #define EILSEQ 84 /* Illegal byte sequence */ #define ERESTART 85 /* Interrupted system call should be restarted */ #define ESTRPIPE 86 /* Streams pipe error */ #define EUSERS 87 /* Too many users */ #define ENOTSOCK 88 /* Socket operation on non-socket */ #define EDESTADDRREQ 89 /* Destination address required */ #define EMSGSIZE 90 /* Message too long */ #define EPROTOTYPE 91 /* Protocol wrong type for socket */ #define ENOPROTOOPT 92 /* Protocol not available */ #define EPROTONOSUPPORT 93 /* Protocol not supported */ #define ESOCKTNOSUPPORT 94 /* Socket type not supported */ #define EOPNOTSUPP 95 /* Operation not supported on transport endpoint */ #define EPFNOSUPPORT 96 /* Protocol family not supported */ #define EAFNOSUPPORT 97 /* Address family not supported by protocol */ #define EADDRINUSE 98 /* Address already in use */ #define EADDRNOTAVAIL 99 /* Cannot assign requested address */ #define ENETDOWN 100 /* Network is down */ #define ENETUNREACH 101 /* Network is unreachable */ #define ENETRESET 102 /* Network dropped connection because of reset */ #define ECONNABORTED 103 /* Software caused connection abort */ #define ECONNRESET 104 /* Connection reset by peer */ #define ENOBUFS 105 /* No buffer space available */ #define EISCONN 106 /* Transport endpoint is already connected */ #define ENOTCONN 107 /* 
Transport endpoint is not connected */ #define ESHUTDOWN 108 /* Cannot send after transport endpoint shutdown */ #define ETOOMANYREFS 109 /* Too many references: cannot splice */ #define ETIMEDOUT 110 /* Connection timed out */ #define ECONNREFUSED 111 /* Connection refused */ #define EHOSTDOWN 112 /* Host is down */ #define EHOSTUNREACH 113 /* No route to host */ #define EALREADY 114 /* Operation already in progress */ #define EINPROGRESS 115 /* Operation now in progress */ #define ESTALE 116 /* Stale NFS file handle */ #define EUCLEAN 117 /* Structure needs cleaning */ #define ENOTNAM 118 /* Not a XENIX named type file */ #define ENAVAIL 119 /* No XENIX semaphores available */ #define EISNAM 120 /* Is a named type file */ #define EREMOTEIO 121 /* Remote I/O error */ #define EDQUOT 122 /* Quota exceeded */ #define ENOMEDIUM 123 /* No medium found */ #define EMEDIUMTYPE 124 /* Wrong medium type */ The list above should actually be enough, but we shall list the same for AIX: >>> errcodes AIX: #define EPERM 1 /* Operation not permitted */ #define ENOENT 2 /* No such file or directory */ #define ESRCH 3 /* No such process */ #define EINTR 4 /* interrupted system call */ #define EIO 5 /* I/O error */ #define ENXIO 6 /* No such device or address */ #define E2BIG 7 /* Arg list too long */ #define ENOEXEC 8 /* Exec format error */ #define EBADF 9 /* Bad file descriptor */ #define ECHILD 10 /* No child processes */ #define EAGAIN 11 /* Resource temporarily unavailable */ #define ENOMEM 12 /* Not enough space */ #define EACCES 13 /* Permission denied */ #define EFAULT 14 /* Bad address */ #define ENOTBLK 15 /* Block device required */ #define EBUSY 16 /* Resource busy */ #define EEXIST 17 /* File exists */ #define EXDEV 18 /* Improper link */ #define ENODEV 19 /* No such device */ #define ENOTDIR 20 /* Not a directory */ #define EISDIR 21 /* Is a directory */ #define EINVAL 22 /* Invalid argument */ #define ENFILE 23 /* Too many open files in system */ #define EMFILE 24 /* 
Too many open files */ #define ENOTTY 25 /* Inappropriate I/O control operation */ #define ETXTBSY 26 /* Text file busy */ #define EFBIG 27 /* File too large */ #define ENOSPC 28 /* No space left on device */ #define ESPIPE 29 /* Invalid seek */ #define EROFS 30 /* Read only file system */ #define EMLINK 31 /* Too many links */ #define EPIPE 32 /* Broken pipe */ #define EDOM 33 /* Domain error within math function */ #define ERANGE 34 /* Result too large */ #define ENOMSG 35 /* No message of desired type */ #define EIDRM 36 /* Identifier removed */ #define ECHRNG 37 /* Channel number out of range */ #define EL2NSYNC 38 /* Level 2 not synchronized */ #define EL3HLT 39 /* Level 3 halted */ #define EL3RST 40 /* Level 3 reset */ #define ELNRNG 41 /* Link number out of range */ #define EUNATCH 42 /* Protocol driver not attached */ #define ENOCSI 43 /* No CSI structure available */ #define EL2HLT 44 /* Level 2 halted */ #define EDEADLK 45 /* Resource deadlock avoided */ #define ENOTREADY 46 /* Device not ready */ #define EWRPROTECT 47 /* Write-protected media */ #define EFORMAT 48 /* Unformatted media */ #define ENOLCK 49 /* No locks available */ #define ENOCONNECT 50 /* no connection */ #define ESTALE 52 /* no filesystem */ #define EDIST 53 /* old, currently unused AIX errno*/ #define EINPROGRESS 55 /* Operation now in progress */ #define EALREADY 56 /* Operation already in progress */ #define ENOTSOCK 57 /* Socket operation on non-socket */ #define EDESTADDRREQ 58 /* Destination address required */ #define EDESTADDREQ EDESTADDRREQ /* Destination address required */ #define EMSGSIZE 59 /* Message too long */ #define EPROTOTYPE 60 /* Protocol wrong type for socket */ #define ENOPROTOOPT 61 /* Protocol not available */ #define EPROTONOSUPPORT 62 /* Protocol not supported */ #define ESOCKTNOSUPPORT 63 /* Socket type not supported */ #define EOPNOTSUPP 64 /* Operation not supported on socket */ #define EPFNOSUPPORT 65 /* Protocol family not supported */ #define EAFNOSUPPORT 
66 /* Address family not supported by protocol family */ #define EADDRINUSE 67 /* Address already in use */ #define EADDRNOTAVAIL 68 /* Can't assign requested address */ #define ENETDOWN 69 /* Network is down */ #define ENETUNREACH 70 /* Network is unreachable */ #define ENETRESET 71 /* Network dropped connection on reset */ #define ECONNABORTED 72 /* Software caused connection abort */ #define ECONNRESET 73 /* Connection reset by peer */ #define ENOBUFS 74 /* No buffer space available */ #define EISCONN 75 /* Socket is already connected */ #define ENOTCONN 76 /* Socket is not connected */ #define ESHUTDOWN 77 /* Can't send after socket shutdown */ #define ETIMEDOUT 78 /* Connection timed out */ #define ECONNREFUSED 79 /* Connection refused */ #define EHOSTDOWN 80 /* Host is down */ #define EHOSTUNREACH 81 /* No route to host */ #define ERESTART 82 /* restart the system call */ #define EPROCLIM 83 /* Too many processes */ #define EUSERS 84 /* Too many users */ #define ELOOP 85 /* Too many levels of symbolic links */ #define ENAMETOOLONG 86 /* File name too long */ #define EDQUOT 88 /* Disc quota exceeded */ #define ECORRUPT 89 /* Invalid file system control data */ #define EREMOTE 93 /* Item is not local to host */ #define ENOSYS 109 /* Function not implemented POSIX */ #define EMEDIA 110 /* media surface error */ #define ESOFT 111 /* I/O completed, but needs relocation */ #define ENOATTR 112 /* no attribute found */ #define ESAD 113 /* security authentication denied */ #define ENOTRUST 114 /* not a trusted program */ #define ETOOMANYREFS 115 /* Too many references: can't splice */ #define EILSEQ 116 /* Invalid wide character */ #define ECANCELED 117 /* asynchronous i/o cancelled */ #define ENOSR 118 /* temp out of streams resources */ #define ETIME 119 /* I_STR ioctl timed out */ #define EBADMSG 120 /* wrong message type at stream head */ #define EPROTO 121 /* STREAMS protocol error */ #define ENODATA 122 /* no message ready at stream head */ #define ENOSTR 123 /* 
fd is not a stream */
#define ECLONEME ERESTART /* this is the way we clone a stream ... */
#define ENOTSUP 124 /* POSIX threads unsupported value */
#define EMULTIHOP 125 /* multihop is not allowed */
#define ENOLINK 126 /* the link has been severed */
#define EOVERFLOW 127 /* value too large to be stored in data type */

==================================
2. NFS and Mount command examples:
==================================

Let's start with something that might be of interest right now:

Examples of mounting a DVD or CDROM:
====================================

AIX:
----
# mount -r -v cdrfs /dev/cd0 /cdrom

Solaris:
--------
# mount -r -F hsfs /dev/dsk/c0t6d0s2 /cdrom

HPUX:
-----
# mount -F cdfs -o rr /dev/dsk/c1t2d0 /cdrom

SuSE Linux:
-----------
# mount -t iso9660 /dev/cdrom /cdrom
# mount -t iso9660 /dev/cdrom /media/cdrom

Redhat Linux:
-------------
# mount -t iso9660 /dev/cdrom /media/cdrom

Other commands on Linux:
------------------------
Sometimes, on some Linux distributions and some SCSI CDROM devices, you might try:

# mount /dev/sr0 /mount_point
# mount -t iso9660 /dev/sr0 /mount_point

Now we return to a discussion of "mounting" and NFS.

2.1 NFS:
========
We will discuss the most important features of NFS by showing how it is implemented on Solaris, Redhat and SuSE Linux. Most of this applies to HP-UX and AIX as well.

2.1.1 NFS and Redhat Linux:
---------------------------
Linux uses a combination of kernel-level support and continuously running daemon processes to provide NFS file sharing; however, NFS support must be enabled in the Linux kernel to function. NFS uses Remote Procedure Calls (RPC) to route requests between clients and servers, meaning that the portmap service must be enabled and active at the proper runlevels for NFS communication to occur.
Working with portmap, various other processes ensure that a particular NFS connection is allowed and may proceed without error:

rpc.mountd  - The running process that receives the mount request from an NFS client and checks to see if it matches with a currently exported file system.
rpc.nfsd    - The process that implements the user-level part of the NFS service. It works with the Linux kernel to meet the dynamic demands of NFS clients, such as providing additional server threads for NFS clients to use.
rpc.lockd   - A daemon that is not necessary with modern kernels. NFS file locking is now done by the kernel. It is included with the nfs-utils package for users of older kernels that do not include this functionality by default.
rpc.statd   - Implements the Network Status Monitor (NSM) RPC protocol. This provides reboot notification when an NFS server is restarted without being gracefully brought down.
rpc.rquotad - An RPC server that provides user quota information for remote users.

Not all of these programs are required for NFS service. The only services that must be enabled are rpc.mountd, rpc.nfsd, and portmap. The other daemons provide additional functionality and should only be used if your server environment requires them.

NFS version 2 uses the User Datagram Protocol (UDP) to provide a stateless network connection between the client and server. NFS version 3 can use UDP or TCP running over IP. The stateless UDP connection minimizes network traffic, as the NFS server sends the client a cookie after the client is authorized to access the shared volume. This cookie is a random value stored on the server's side and is passed along with RPC requests from the client. The NFS server can be restarted without affecting the clients, and the cookie will remain intact.

NFS only performs authentication when a client system attempts to mount a remote file system. To limit access, the NFS server first employs TCP wrappers.
TCP wrappers reads the /etc/hosts.allow and /etc/hosts.deny files to determine if a particular client should be permitted or prevented access to the NFS server. After the client is allowed past TCP wrappers, the NFS server refers to its configuration file, /etc/exports, to determine whether the client has enough privileges to mount any of the exported file systems. After granting access, any file and directory operations are sent to the server using remote procedure calls.

Warning: NFS mount privileges are granted specifically to a client, not a user. If you grant a client machine access to an exported file system, any users of that machine will have access to the data. When configuring the /etc/exports file, be extremely careful about granting read-write permissions (rw) to a remote host.

-- NFS and portmap
NFS relies upon remote procedure calls (RPC) to function. portmap is required to map RPC requests to the correct services. RPC processes notify portmap when they start, revealing the port number they are monitoring and the RPC program numbers they expect to serve. The client system then contacts portmap on the server with a particular RPC program number. portmap then redirects the client to the proper port number to communicate with its intended service.
Because RPC-based services rely on portmap to make all connections with incoming client requests, portmap must be available before any of these services start. If, for some reason, the portmap service unexpectedly quits, restart portmap and any services that were running when it was started.
The portmap service can be used with the host access files (/etc/hosts.allow and /etc/hosts.deny) to control which remote systems are permitted to use RPC-based services on your machine. Access control rules for portmap will affect all RPC-based services. Alternatively, you can specify each of the NFS RPC daemons to be affected by a particular access control rule.
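For example, a common pattern is to deny portmap to everyone in /etc/hosts.deny and then allow only trusted hosts in /etc/hosts.allow. The subnet below is purely an illustration; note that portmap's access control traditionally matches on IP addresses and networks rather than hostnames:

```
# /etc/hosts.deny -- deny portmap to all clients by default
portmap: ALL

# /etc/hosts.allow -- then allow only the local subnet (example network)
portmap: 192.168.1.0/255.255.255.0
```

With these rules in place, RPC-based services such as NFS are reachable only from the listed network.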
The man pages for rpc.mountd and rpc.statd contain information regarding the precise syntax of these rules.

-- portmap Status

As portmap provides the coordination between RPC services and the port numbers used to communicate with them, it is useful to be able to get a picture of the current RPC services using portmap when troubleshooting. The rpcinfo command shows each RPC-based service with its port number, RPC program number, version, and IP protocol type (TCP or UDP).

To make sure the proper NFS RPC-based services are enabled for portmap, rpcinfo -p can be useful:

# rpcinfo -p
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp   1024  status
    100024    1   tcp   1024  status
    100011    1   udp    819  rquotad
    100011    2   udp    819  rquotad
    100005    1   udp   1027  mountd
    100005    1   tcp   1106  mountd
    100005    2   udp   1027  mountd
    100005    2   tcp   1106  mountd
    100005    3   udp   1027  mountd
    100005    3   tcp   1106  mountd
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100021    1   udp   1028  nlockmgr
    100021    3   udp   1028  nlockmgr
    100021    4   udp   1028  nlockmgr

The -p option probes the portmapper on the specified host, or defaults to localhost if no specific host is listed. Other options are available from the rpcinfo man page. From the output above, various NFS services can be seen running.

If one of the NFS services does not start up correctly, portmap will be unable to map RPC requests from clients for that service to the correct port. In many cases, restarting NFS as root will cause those services to correctly register with portmap and begin working:

# /sbin/service nfs restart

-- NFS Server Configuration Files

Configuring a system to share files and directories using NFS is straightforward. Every file system being exported to remote users via NFS, as well as the access rights relating to those file systems, is located in the /etc/exports file.
This file is read by the exportfs command to give rpc.mountd and rpc.nfsd the information necessary to allow the remote mounting of a file system by an authorized host. The exportfs command allows you to selectively export or unexport directories without restarting the various NFS services. When exportfs is passed the proper options, the file systems to be exported are written to /var/lib/nfs/xtab. Since rpc.mountd refers to the xtab file when deciding access privileges to a file system, changes to the list of exported file systems take effect immediately.

Various options are available when using exportfs:

-r - Causes all directories listed in /etc/exports to be exported by constructing a new export list in /var/lib/nfs/xtab. This option effectively refreshes the export list with any changes that have been made to /etc/exports.

-a - Causes all directories to be exported or unexported, depending on the other options passed to exportfs.

-o options - Allows the user to specify directories to be exported that are not listed in /etc/exports. These additional file system shares must be written in the same way they are specified in /etc/exports. This option is used to test an exported file system before adding it permanently to the list of file systems to be exported.

-i - Tells exportfs to ignore /etc/exports; only options given from the command line are used to define exported file systems.

-u - Unexports directories from being mounted by remote users. The command exportfs -ua effectively suspends NFS file sharing while keeping the various NFS daemons up. To allow NFS sharing to continue, type exportfs -r.

-v - Verbose operation, where the file systems being exported or unexported are displayed in greater detail when the exportfs command is executed.

If no options are passed to the exportfs command, it displays a list of currently exported file systems. Changes to /etc/exports can also be read by reloading the NFS service with the service nfs reload command.
This keeps the NFS daemons running while re-exporting the /etc/exports file.

-- /etc/exports

The /etc/exports file is the standard for controlling which file systems are exported to which hosts, as well as specifying particular options that control access. Blank lines are ignored, comments can be made using #, and long lines can be wrapped with a backslash (\). Each exported file system should be on its own line. Lists of authorized hosts placed after an exported file system must be separated by space characters. Options for each of the hosts must be placed in parentheses directly after the host identifier, without any spaces separating the host and the first parenthesis.

In its simplest form, /etc/exports only needs to know the directory to be exported and the hosts permitted to use it:

/some/directory bob.domain.com
/another/exported/directory 192.168.0.3

After re-exporting /etc/exports with the "/sbin/service nfs reload" command, the bob.domain.com host will be able to mount /some/directory and 192.168.0.3 can mount /another/exported/directory.

Because no options are specified in this example, several default NFS preferences take effect. In order to override these defaults, you must specify an option that takes its place. For example, if you do not specify rw, then that export will only be shared read-only. Each default for every exported file system must be explicitly overridden. Additionally, other options are available where no default value is in place. These include the ability to disable sub-tree checking, allow access from insecure ports, and allow insecure file locks (necessary for certain early NFS client implementations). See the exports man page for details on these lesser used options.

When specifying hostnames, you can use the following methods:

single host - Where one particular host is specified with a fully qualified domain name, hostname, or IP address.

wildcards - Where a * or ?
character is used to take into account a grouping of fully qualified domain names that match a particular string of letters. Wildcards are not to be used with IP addresses; however, they may accidentally work if reverse DNS lookups fail. Be careful when using wildcards with fully qualified domain names, as they tend to be more exact than you would expect. For example, the use of *.domain.com as a wildcard will allow sales.domain.com to access the exported file system, but not bob.sales.domain.com. To match both possibilities, as well as sam.corp.domain.com, you would have to provide *.domain.com *.*.domain.com.

IP networks - Allows the matching of hosts based on their IP addresses within a larger network. For example, 192.168.0.0/28 will allow the first 16 IP addresses, from 192.168.0.0 to 192.168.0.15, to access the exported file system, but not 192.168.0.16 and higher.

netgroups - Permits an NIS netgroup name, written as @<group-name>, to be used. This effectively puts the NIS server in charge of access control for this exported file system, where users can be added and removed from an NIS group without affecting /etc/exports.

Warning:
The way in which the /etc/exports file is formatted is very important, particularly concerning the use of space characters. Remember to always separate exported file systems from hosts, and hosts from one another, with a space character. However, there should be no other space characters in the file unless they are used in comment lines.

For example, the following two lines do not mean the same thing:

/home bob.domain.com(rw)
/home bob.domain.com (rw)

The first line allows only users from bob.domain.com read-write access to the /home directory. The second line allows users from bob.domain.com to mount the directory read-only (the default), but the rest of the world can mount it read-write. Be careful where space characters are used in /etc/exports.

-- NFS Client Configuration Files - What to do on a client?
Any NFS share made available by a server can be mounted using various methods. Of course, the share can be manually mounted, using the mount command, to acquire the exported file system at a particular mount point. However, this requires that the root user type the mount command every time the system restarts. In addition, the root user must remember to unmount the file system when shutting down the machine. Two methods of configuring NFS mounts include modifying /etc/fstab or using the autofs service.

> /etc/fstab

Placing a properly formatted line in the /etc/fstab file has the same effect as manually mounting the exported file system. The /etc/fstab file is read by the /etc/rc.d/init.d/netfs script at system startup. The proper file system mounts, including NFS, are put into place.

A sample /etc/fstab line to mount an NFS export looks like the following:

<server>:</remote/directory> </local/mount/point> nfs <options> 0 0

The <server> relates to the hostname, IP address, or fully qualified domain name of the server exporting the file system. The </remote/directory> tells the server what export to mount. The </local/mount/point> specifies where on the local file system to mount the exported directory. This mount point must exist before /etc/fstab is read or the mount will fail. The nfs option specifies the type of file system being mounted. The <options> area specifies how the file system is to be mounted. For example, if the options area states rw,suid on a particular mount, the exported file system will be mounted read-write and the user and group ID set by the server will be used. Note, parentheses are not to be used here.

2.1.2 NFS and SuSE Linux:
-------------------------

-- Importing File Systems with YaST

Any user authorized to do so can mount NFS directories from an NFS server into his own file tree. This can be achieved most easily using the YaST module `NFS Client'. Just enter the host name of the NFS server, the directory to import, and the mount point at which to mount this directory locally. All this is done after clicking `Add' in the first dialog.
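Tying the /etc/fstab method described earlier to a concrete case: assuming a hypothetical server named nfssrv exporting /exports/data, to be mounted on /mnt/data, the entry might read as follows (server name, export path, and options are examples only):

```
# <server>:<export>   <mount point>  <type>  <options>     <dump> <fsck>
nfssrv:/exports/data  /mnt/data      nfs     rw,soft,intr  0      0
```

The mount point /mnt/data must exist before the entry can take effect.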
-- Importing File Systems Manually File systems can easily be imported manually from an NFS server. The only prerequisite is a running RPC port mapper, which can be started by entering the command # rcportmap start as root. Once this prerequisite is met, remote file systems exported on the respective machines can be mounted in the file system just like local hard disks using the command mount with the following syntax: # mount host:remote-path local-path If user directories from the machine sun, for example, should be imported, the following command can be used: # mount sun:/home /home -- Exporting File Systems with YaST With YaST, turn a host in your network into an NFS server - a server that exports directories and files to all hosts granted access to it. This could be done to provide applications to all coworkers of a group without installing them locally on each and every host. To install such a server, start YaST and select `Network Services' -> `NFS Server' Next, activate `Start NFS Server' and click `Next'. In the upper text field, enter the directories to export. Below, enter the hosts that should have access to them. There are four options that can be set for each host: single host, netgroups, wildcards, and IP networks. A more thorough explanation of these options is provided by man exports. `Exit' completes the configuration. -- Exporting File Systems Manually If you do not want to use YaST, make sure the following systems run on the NFS server: RPC portmapper (portmap) RPC mount daemon (rpc.mountd) RPC NFS daemon (rpc.nfsd) For these services to be started by the scripts "/etc/init.d/portmap" and "/etc/init.d/nfsserver" when the system is booted, enter the commands # insserv /etc/init.d/nfsserver and # insserv /etc/init.d/portmap. Also define which file systems should be exported to which host in the configuration file "/etc/exports". For each directory to export, one line is needed to set which machines may access that directory with what permissions. 
All subdirectories of this directory are automatically exported as well. Authorized machines are usually specified with their full names (including domain name), but it is possible to use wild cards like * or ? (which expand the same way as in the Bash shell). If no machine is specified here, any machine is allowed to import this file system with the given permissions. Set permissions for the file system to export in brackets after the machine name. The most important options are: ro File system is exported with read-only permission (default). rw File system is exported with read-write permission. root_squash This makes sure the user root of the given machine does not have root permissions on this file system. This is achieved by assigning user ID 65534 to users with user ID 0 (root). This user ID should be set to nobody (which is the default). no_root_squash Does not assign user ID 0 to user ID 65534, keeping the root permissions valid. link_relative Converts absolute links (those beginning with /) to a sequence of ../. This is only useful if the entire file system of a machine is mounted (default). link_absolute Symbolic links remain untouched. map_identity User IDs are exactly the same on both client and server (default). map_daemon Client and server do not have matching user IDs. This tells nfsd to create a conversion table for user IDs. The ugidd daemon is required for this to work. /etc/exports is read by mountd and nfsd. If you change anything in this file, restart mountd and nfsd for your changes to take effect. This can easily be done with "rcnfsserver restart". 
Example SuSE /etc/exports:

#
# /etc/exports
#
/home            sun(rw)   venus(rw)
/usr/X11         sun(ro)   venus(ro)
/usr/lib/texmf   sun(ro)   venus(rw)
/                earth(ro,root_squash)
/home/ftp        (ro)
# End of exports

2.2 Mount command:
==================

The standard form of the mount command is

mount -F typefs device mountdir    (Solaris, HP-UX)
mount -t typefs device mountdir    (many other unixes)

This tells the kernel to attach the file system found on "device" (which is of the given type) at the directory "mountdir". The previous contents (if any) and owner and mode of the directory become invisible, and as long as this file system remains mounted, the pathname refers to the root of the file system on device.

The syntax is: mount [options] [type] [device] [mountpoint]

-- mounting a remote filesystem:

syntax: mount -F nfs [-o specific_options] [-O] server:/pathname /mountpoint

# mount -F nfs hpsrv:/data /data
# mount -F nfs -o hard,intr thor:/data /data

- standard mounts are determined by files like /etc/fstab (HP-UX) or /etc/filesystems (AIX) or /etc/vfstab etc..

2.2.1 Where are the standard mounts defined?
============================================

In Solaris:
===========

- standard mounts are determined by /etc/vfstab etc..
- NFS mounts are determined by the file /etc/dfs/dfstab. Here you will find share commands.
- currently mounted filesystems are listed in /etc/mnttab

In Linux:
=========

- standard mounts are determined in most Linux distros by "/etc/fstab".

In AIX:
=======

- standard mounts and properties are determined by the file "/etc/filesystems".

In HP-UX:
=========

There is a /etc/fstab which contains all of the filesystems that are mounted at boot time. The filesystems that are OS related are /, /var, /opt, /tmp, /usr, /stand. The filesystem that is special is /stand; this is where your kernel is built and resides. Notice that the filesystem type is "hfs".
HP-UX kernels MUST reside on an hfs filesystem.

An example of /etc/vfstab:
--------------------------

starboss:/etc $ more vfstab
#device            device              mount       FS     fsck  mount    mount
#to mount          to fsck             point       type   pass  at boot  options
#
fd                 -                   /dev/fd     fd     -     no       -
/proc              -                   /proc       proc   -     no       -
/dev/md/dsk/d1     -                   -           swap   -     no       -
/dev/md/dsk/d0     /dev/md/rdsk/d0     /           ufs    1     no       logging
/dev/md/dsk/d4     /dev/md/rdsk/d4     /usr        ufs    1     no       logging
/dev/md/dsk/d3     /dev/md/rdsk/d3     /var        ufs    1     no       logging
/dev/md/dsk/d7     /dev/md/rdsk/d7     /export     ufs    2     yes      logging
/dev/md/dsk/d5     /dev/md/rdsk/d5     /usr/local  ufs    2     yes      logging
/dev/dsk/c2t0d0s0  /dev/rdsk/c2t0d0s0  /export2    ufs    2     yes      logging
swap               -                   /tmp        tmpfs  -     yes      size=512m

mount adds an entry, umount deletes an entry. Mounting applies to local filesystems, or remote filesystems via NFS.

Local mount example:

# mount -F ufs -o logging /dev/dsk/c0t0d0s3 /mnt

At the remote server: share, shareall, or add an entry in /etc/dfs/dfstab:

# share -F nfs /var/mail

Unmount a mounted FS. First check who is using it:

# fuser -c mountpoint
# umount mountpoint

2.2.2 Mounting a NFS filesystem in HP-UX:
=========================================

Mounting Remote File Systems

You can use either SAM or the mount command to mount file systems located on a remote system. Before you can mount file systems located on a remote system, NFS software must be installed and configured on both local and remote systems. Refer to Installing and Administering NFS for information. For information on mounting NFS file systems using SAM, see SAM's online help.

To mount a remote file system using HP-UX commands, you must know the name of the host machine and the file system's directory on the remote machine. Establish communication over a network between the local system (that is, the "client") and the remote system. (The local system must be able to reach the remote system via whatever hosts database is in use; see named(1M) and hosts(4).) If necessary, test the connection with /usr/sbin/ping; see ping(1M).
Make sure the file /etc/exports on the remote system lists the file systems that you wish to make available to clients (that is, to "export"), and the local systems that should be allowed to mount them. For example, to allow machines called rolf and egbert to remotely mount the /usr file system, edit the file /etc/exports on the remote machine and include the line:

/usr rolf egbert

Execute /usr/sbin/exportfs -a on the remote system to export all directories in /etc/exports to clients. For more information, see exportfs(1M).

NOTE: If you wish to invoke exportfs -a at boot time, make sure the NFS configuration file /etc/rc.config.d/nfsconf on the remote system contains the following settings: NFS_SERVER=1 and START_MOUNTD=1. The client's /etc/rc.config.d/nfsconf file must contain NFS_CLIENT=1. Then issue the following command to run the script:

/sbin/init.d/nfs.server start

Mount the file system on the local system, as in:

# mount -F nfs remotehost:/remote_dir /local_dir

Just a bunch of mount command examples:
---------------------------------------

# mount
# mount -a
# mountall -l
# mount -t type device dir
# mount -F pcfs /dev/dsk/c0t0d0p0:c /pcfs/c
# mount /dev/md/dsk/d7 /u01
# mount sun:/home /home
# mount -t nfs 137.82.51.1:/share/sunos/local /usr/local
# mount /dev/fd0 /mnt/floppy
# mount -o ro /dev/dsk/c0t6d0s1 /mnt/cdrom
# mount -V cdrfs -o ro /dev/cd0 /cdrom

2.2.3 Solaris mount command:
============================

The unix mount command is used to mount a filesystem; it attaches disks and directories logically rather than physically. It takes a minimum of two arguments:

1) the name of the special device which contains the filesystem
2) the name of an existing directory on which to mount the file system

Once the file system is mounted, the directory becomes the mount point. All the file systems will now be usable as if they were subdirectories of the file system they were mounted on.
The table of currently mounted file systems can be found by examining the mounted file system information file. This is provided by a file system that is usually mounted on /etc/mnttab.

Mounting a file system causes three actions to occur:

1. The superblock for the mounted file system is read into memory
2. An entry is made in the /etc/mnttab file
3. An entry is made in the inode for the directory on which the file system is mounted, which marks the directory as a mount point

The /etc/mountall command mounts all filesystems as described in the /etc/vfstab file. Note that the /etc/mount and /etc/mountall commands can only be executed by the superuser.

OPTIONS

-F FSType - Used to specify the FSType on which to operate. The FSType must be specified or must be determinable from /etc/vfstab, or by consulting /etc/default/fs or /etc/dfs/fstypes.

-a [ mount_points... ] - Perform mount or umount operations in parallel, when possible. If mount points are not specified, mount will mount all file systems whose /etc/vfstab "mount at boot" field is "yes". If mount points are specified, then the /etc/vfstab "mount at boot" field will be ignored. If mount points are specified, umount will only umount those mount points. If none is specified, then umount will attempt to unmount all file systems in /etc/mnttab, with the exception of certain system required file systems: /, /usr, /var, /var/adm, /var/run, /proc, /dev/fd and /tmp.

-f - Forcibly unmount a file system. Without this option, umount does not allow a file system to be unmounted if a file on the file system is busy. Using this option can cause data loss for open files; programs which access files after the file system has been unmounted will get an error (EIO).

-p - Print the list of mounted file systems in the /etc/vfstab format. Must be the only option specified.

-v - Print the list of mounted file systems in verbose format. Must be the only option specified.

-V - Echo the complete command line, but do not execute the command.
umount generates a command line by using the options and arguments provided by the user and adding to them information derived from /etc/mnttab. This option should be used to verify and validate the command line.

generic_options - Options that are commonly supported by most FSType-specific command modules. The following options are available:

-m - Mount the file system without making an entry in /etc/mnttab.

-g - Globally mount the file system. On a clustered system, this globally mounts the file system on all nodes of the cluster. On a non-clustered system this has no effect.

-o - Specify FSType-specific options in a comma separated (without spaces) list of suboptions and keyword-attribute pairs for interpretation by the FSType-specific module of the command. (See mount_ufs(1M))

-O - Overlay mount. Allow the file system to be mounted over an existing mount point, making the underlying file system inaccessible. If a mount is attempted on a pre-existing mount point without setting this flag, the mount will fail, producing the error "device busy".

-r - Mount the file system read-only.
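The "mount at boot" behaviour of the -a option described above can be illustrated with standard tools. A minimal sketch: it parses a small sample file in /tmp/vfstab.sample (rows copied from the Solaris vfstab example elsewhere in this document) rather than the live /etc/vfstab:

```shell
# Build a sample vfstab; on a real Solaris box you would read /etc/vfstab.
cat > /tmp/vfstab.sample <<'EOF'
#device          device            mount       FS    fsck  mount    mount
#to mount        to fsck           point       type  pass  at boot  options
/dev/md/dsk/d0   /dev/md/rdsk/d0   /           ufs   1     no       logging
/dev/md/dsk/d7   /dev/md/rdsk/d7   /export     ufs   2     yes      logging
/dev/md/dsk/d5   /dev/md/rdsk/d5   /usr/local  ufs   2     yes      logging
EOF

# Print the mount points whose "mount at boot" field (column 6) is "yes",
# i.e. the file systems that "mount -a" would mount by default.
awk '!/^#/ && $6 == "yes" { print $3 }' /tmp/vfstab.sample
```

For the sample above this prints /export and /usr/local; the root file system is marked "no" because it is mounted earlier in the boot sequence.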
Example mount:

# mount -F ufs -o logging /dev/dsk/c0t0d0s3 /mnt

Example mountpoints and disks:
------------------------------

Mountpoint  Device            Size (MB)  Purpose
/           /dev/md/dsk/d1    100        Unix root filesystem
/usr        /dev/md/dsk/d3    1200       Unix usr filesystem
/var        /dev/md/dsk/d4    200        Unix var filesystem
/home       /dev/md/dsk/d5    200        Unix home filesystem
/opt        /dev/md/dsk/d6    4700       Oracle_Home
/u01        /dev/md/dsk/d7    8700       Oracle datafiles
/u02        /dev/md/dsk/d8    8700       Oracle datafiles
/u03        /dev/md/dsk/d9    8700       Oracle datafiles
/u04        /dev/md/dsk/d10   8700       Oracle datafiles
/u05        /dev/md/dsk/d110  8700       Oracle datafiles
/u06        /dev/md/dsk/d120  8700       Oracle datafiles
/u07        /dev/md/dsk/d123  8650       Oracle datafiles

Suppose you have only 1 disk of about 72GB, and 2GB RAM:

Entire disk = slice 2
/        slice 0, partition about 2G
swap     slice 1, partition about 4G
/export  slice 3, partition about 50G, maybe you link it to /u01
/var     slice 4, partition about 2G
/opt     slice 5, partition about 10G if you plan to install apps here
/usr     slice 6, partition about 2G
/u01     slice 7, partition optional, standard it's /home

Depending on how you configure /export, size could be around 20G.

2.2.4 mount command on AIX:
===========================

Typical examples:

# mount -o soft 10.32.66.75:/data/nim /mnt
# mount -o soft abcsrv:/data/nim /mnt
# mount -o soft n580l03:/data/nim /mnt

Note 1:
-------

mount [ -f ] [ -n Node ] [ -o Options ] [ -p ] [ -r ] [ -v VfsName ]
      [ -t Type | [ Device | Node:Directory ] Directory | all | -a ]
      [ -V [generic_options] special_mount_points ]

If you specify only the Directory parameter, the mount command takes it to be the name of the directory or file on which a file system, directory, or file is usually mounted (as defined in the /etc/filesystems file). The mount command looks up the associated device, directory, or file and mounts it.
This is the most convenient way of using the mount command, because it does not require you to remember what is normally mounted on a directory or file. You can also specify only the device. In this case, the command obtains the mount point from the /etc/filesystems file.

The /etc/filesystems file should include a stanza for each mountable file system, directory, or file. This stanza should specify at least the name of the file system and either the device on which it resides or the directory name. If the stanza includes a mount attribute, the mount command uses the associated values. It recognizes five values for the mount attribute: automatic, true, false, removable, and readonly.

The mount all command causes all file systems with the mount=true attribute to be mounted in their normal places. This command is typically used during system initialization, and the corresponding mounts are referred to as automatic mounts.

Example mount command on AIX:
-----------------------------

$ mount
  node   mounted          mounted over     vfs    date          options
-------- ---------------- ---------------- ------ ------------  ---------------
         /dev/hd4         /                jfs2   Jun 06 17:15  rw,log=/dev/hd8
         /dev/hd2         /usr             jfs2   Jun 06 17:15  rw,log=/dev/hd8
         /dev/hd9var      /var             jfs2   Jun 06 17:15  rw,log=/dev/hd8
         /dev/hd3         /tmp             jfs2   Jun 06 17:15  rw,log=/dev/hd8
         /dev/hd1         /home            jfs2   Jun 06 17:16  rw,log=/dev/hd8
         /proc            /proc            procfs Jun 06 17:16  rw
         /dev/hd10opt     /opt             jfs2   Jun 06 17:16  rw,log=/dev/hd8
         /dev/fslv00      /XmRec           jfs2   Jun 06 17:16  rw,log=/dev/hd8
         /dev/fslv01      /tmp/m2          jfs2   Jun 06 17:16  rw,log=/dev/hd8
         /dev/fslv02      /software        jfs2   Jun 06 17:16  rw,log=/dev/hd8
         /dev/oralv       /opt/app/oracle  jfs2   Jun 06 17:25  rw,log=/dev/hd8
         /dev/db2lv       /db2_database    jfs2   Jun 06 19:54  rw,log=/dev/loglv00
         /dev/fslv03      /bmc_home        jfs2   Jun 07 12:11  rw,log=/dev/hd8
         /dev/homepeter   /home/peter      jfs2   Jun 13 18:42  rw,log=/dev/hd8
         /dev/bmclv       /bcict/stage     jfs2   Jun 15 15:21  rw,log=/dev/hd8
         /dev/u01         /u01             jfs2   Jun 22 00:22  rw,log=/dev/loglv01
         /dev/u02         /u02             jfs2   Jun 22 00:22  rw,log=/dev/loglv01
         /dev/u05         /u05             jfs2   Jun 22 00:22  rw,log=/dev/loglv01
         /dev/u03         /u03             jfs2   Jun 22 00:22  rw,log=/dev/loglv01
         /dev/backuo      /backup_ora      jfs2   Jun 22 00:22  rw,log=/dev/loglv02
         /dev/u02back     /u02back         jfs2   Jun 22 00:22  rw,log=/dev/loglv03
         /dev/u01back     /u01back         jfs2   Jun 22 00:22  rw,log=/dev/loglv03
         /dev/u05back     /u05back         jfs2   Jun 22 00:22  rw,log=/dev/loglv03
         /dev/u04back     /u04back         jfs2   Jun 22 00:22  rw,log=/dev/loglv03
         /dev/u03back     /u03back         jfs2   Jun 22 00:22  rw,log=/dev/loglv03
         /dev/u04         /u04             jfs2   Jun 22 10:25  rw,log=/dev/loglv01

Example /etc/filesystems file:

/var:
        dev   = /dev/hd9var
        vfs   = jfs2
        log   = /dev/hd8
        mount = automatic
        check = false
        type  = bootfs
        vol   = /var
        free  = false

/tmp:
        dev   = /dev/hd3
        vfs   = jfs2
        log   = /dev/hd8
        mount = automatic
        check = false
        vol   = /tmp
        free  = false

/opt:
        dev   = /dev/hd10opt
        vfs   = jfs2
        log   = /dev/hd8
        mount = true
        check = true
        vol   = /opt
        free  = false

Example of the relation of Logical Volumes and mountpoints:

/dev/lv01 = /u01
/dev/lv02 = /u02
/dev/lv03 = /u03
/dev/lv04 = /data
/dev/lv00 = /spl

2.2.5 Some other commands related to mounts:
============================================

fsstat command:
---------------

On some unixes, the fsstat command is available. It provides filesystem statistics. It can take a lot of switches, so be sure to check the man pages.
On Solaris, the following example shows the statistics for each file operation for "/" (using the -f option):

$ fsstat -f /
 Mountpoint: /
  operation   #ops     bytes
       open   8.54K
      close   9.8K
       read   43.6K    65.9M
      write   1.57K    2.99M
      ioctl   2.06K
      setfl   4
    getattr   40.3K
    setattr   38
     access   9.19K
     lookup   203K
     create   595
     remove   56
       link   0
     rename   9
      mkdir   19
      rmdir   0
    readdir   2.02K    2.27M
    symlink   4
   readlink   8.31K
      fsync   199
   inactive   2.96K
        fid   0
     rwlock   47.2K
   rwunlock   47.2K
       seek   29.1K
        cmp   42.9K
     frlock   4.45K
      space   8
     realvp   3.25K
    getpage   104K
    putpage   2.69K
        map   13.2K
     addmap   34.4K
     delmap   33.4K
       poll   287
       dump   0
   pathconf   54
     pageio   0
    dumpctl   0
    dispose   23.8K
 getsecattr   697
 setsecattr   0
    shrlock   0
    vnevent   0

fuser command:
--------------

AIX:

Purpose
Identifies processes using a file or file structure.

Syntax
fuser [ -c | -d | -f ] [ -k ] [ -u ] [ -x ] [ -V ] File ...

Description
The fuser command lists the process numbers of local processes that use the local or remote files specified by the File parameter. For block special devices, the command lists the processes that use any file on that device.

Flags
-c Reports on any open files in the file system containing File.
-d Implies the use of the -c and -x flags. Reports on any open files which have been unlinked from the file system (deleted from the parent directory). When used in conjunction with the -V flag, it also reports the inode number and size of the deleted file.
-f Reports on open instances of File only.
-k Sends the SIGKILL signal to each local process. Only the root user can kill a process of another user.
-u Provides the login name for local processes in parentheses after the process number.
-V Provides verbose output.
-x Used in conjunction with -c or -f, reports on executable and loadable objects in addition to the standard fuser output.
To list the process numbers of local processes using the /etc/passwd file, enter:

# fuser /etc/passwd

To list the process numbers and user login names of processes using the /etc/filesystems file, enter:

# fuser -u /etc/filesystems

To terminate all of the processes using a given file system, enter:

# fuser -k -x -u /dev/hd1
-OR-
# fuser -kxuc /home

Either command lists the process number and user name, and then terminates each process that is using the /dev/hd1 (/home) file system. Only the root user can terminate processes that belong to another user. You might want to use this command if you are trying to unmount the /dev/hd1 file system and a process that is accessing the /dev/hd1 file system prevents this.

To list all processes that are using a file which has been deleted from a given file system, enter:

# fuser -d /usr

Examples on Linux distros:

- To kill all processes accessing the file system /home in any way:
# fuser -km /home

- To invoke something only if no other process is using /dev/ttyS1:
if fuser -s /dev/ttyS1; then :; else something; fi

- To show all processes at the (local) TELNET port:
# fuser telnet/tcp

A similar command is the lsof command.

2.2.6 Starting and stopping NFS:
================================

Short note on stopping and starting NFS. See other sections for more detail. On all unixes, a number of daemons should be running in order for NFS to be functional, for example the rpc.* processes, biod, nfsd and others. Once NFS is running, in order to actually "share" or "export" a filesystem on your server so that remote clients are able to mount it, in most cases you should edit the "/etc/exports" file. See other sections in this document (search on exportfs) on how to accomplish this.

-- AIX:

The following subsystems are part of the nfs group: nfsd, biod, rpc.lockd, rpc.statd, and rpc.mountd.
The nfs subsystem (group) is under control of the "resource controller", so starting and stopping NFS is actually easy:

# startsrc -g nfs
# stopsrc -g nfs

Or use smitty.

-- Redhat Linux:

# /sbin/service nfs restart
# /sbin/service nfs start
# /sbin/service nfs stop

-- On some other Linux distros:

# /etc/init.d/nfs start
# /etc/init.d/nfs stop
# /etc/init.d/nfs restart

-- Solaris:

If the nfs daemons aren't running, then you will need to run:

# /etc/init.d/nfs.server start

-- HP-UX:

Issue the following command on the NFS server to start all the necessary NFS processes (HP):

# /sbin/init.d/nfs.server start

Or if your machine is only a client:

# cd /sbin/init.d
# ./nfs.client start

===========================================
3. Change ownership file/dir, adding users:
===========================================

3.1 Changing ownership:
-----------------------

chown -R user[:group] file/dir   (SVR4)
chown -R user[.group] file/dir   (BSD)

(-R means recursive: apply to all subdirectories as well)

Examples:

chown -R oracle:oinstall /opt/u01
chown -R oracle:oinstall /opt/u02
chown -R oracle:oinstall /opt/u03
chown -R oracle:oinstall /opt/u04

chown rjanssen file.txt - Makes user rjanssen the owner of file.txt.

# groupadd dba
# useradd oracle
# mkdir /usr/oracle
# mkdir /usr/oracle/9.0
# chown -R oracle:dba /usr/oracle
# touch /etc/oratab
# chown oracle:dba /etc/oratab

Note: "Not owner" message:
--------------------------

>>> Solaris:

It is possible to turn the chown command on or off (i.e., allow it to be used or disallow its use) on a system by altering the /etc/system file. The /etc/system file, along with the files in /etc/default, should be thought of as "system policy files" -- files that allow the systems administrator to determine such things as whether root can login over the network, whether su commands are logged, and whether a regular user can change ownership of his own files.
On a system disallowing a user to change ownership of his files (this is now the default), the value
of rstchown is set to 1. Think of this as saying "restrict chown is set to TRUE". You might see a
line like this in /etc/system (or no rstchown value at all):

set rstchown=1

On a system allowing chown by regular users, this value will be set to 0, as shown here:

set rstchown=0

Whenever the /etc/system file is changed, the system will have to be rebooted for the changes to
take effect. Since there is no daemon process associated with commands such as chown, there is no
process that one could send a hangup (HUP) to effect the change in policy "on the fly".

Why might system administrators restrict access to the chown command? For a system on which disk
quotas are enforced, they might not want to allow files to be "assigned" by one user to another
user's quota. More importantly, for a system on which accountability is deemed important, system
administrators will want to know who created each file on a system - whether to track down a
potential system abuse, or simply to ask if a file that is occupying space in a shared directory
or in /tmp can be removed.

When a system disallows use of the chown command, you can expect to see output like this:

% chown wallace myfile
chown: myfile: Not owner

Though it would be possible to disallow "chowning" of files by changing permissions on
/usr/bin/chown, such a change would not slow down most Unix users. They would simply copy the
/usr/bin/chown file to their own directory and make their copy executable. Designed to be
extensible, Unix will happily comply. Making the change in the /etc/system file blocks any chown
operation from taking effect, regardless of where the executable is stored, who owns it, and what
it is called. If usage of chown is restricted in /etc/system, only the superuser can change
ownership of files.
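The policy above is easy to probe from the shell. A minimal sketch (my own illustration, not from
the original text; "nobody" is just a convenient target account) that tries to give a scratch file
away and reports what the system decided:

```shell
#!/bin/sh
# Try to give away ownership of a scratch file we own.
# As a regular user (rstchown=1 on Solaris, and the default on Linux),
# the kernel refuses; as root (or with rstchown=0) it succeeds.
tmpfile=$(mktemp /tmp/chowntest.XXXXXX)

if chown nobody "$tmpfile" 2>/dev/null
then
    echo "chown succeeded (root, or rstchown=0)"
else
    echo "chown refused: Not owner"
fi

rm -f "$tmpfile"
```

Run this as an unprivileged user to see the refusal; the same test run by root always succeeds,
which is exactly why the restriction lives in kernel policy rather than in the chown binary itself.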
3.2 Add a user in Solaris:
--------------------------

Examples:

# useradd -u 3000 -g other -d /export/home/tempusr -m -s /bin/ksh -c "temporary user" tempusr
# useradd -u 1002 -g dba -d /export/home/avdsel -m -s /bin/ksh -c "Albert van der Sel" avdsel
# useradd -u 1001 -g oinstall -G dba -d /export/home/oraclown -m -s /bin/ksh -c "Oracle owner" oraclown
# useradd -u 1005 -g oinstall -G dba -d /export/home/brighta -m -s /bin/ksh -c "Bright Alley" brighta
# useradd -u 300 -g staff -G staff -d /home/emc -m -s /usr/bin/ksh -c "EMC user" emc

A password cannot be specified using the useradd command. Use passwd to give the user a password:

# passwd tempusr

UID must be unique and is typically a number between 100 and 60002.
GID is a number between 0 and 60002.

Or use the graphical "admintool" or smc, the Solaris Management Console.

-- Profiles a user can use to set the environment:

1. Korn Shell ksh:
------------------

When the POSIX or Korn Shell is your login shell, it looks for the following files and executes
them, if they exist:

/etc/profile   This default system file is executed by the shell program and sets up default
               environment variables.
.profile       If this file exists in your home directory, it is executed next at login.

Any time the POSIX or Korn Shell is invoked (this includes login time), it looks for the file
referenced by the following shell variable, and executes it, if it exists:

ENV            When you invoke the shell, it looks for a shell variable called ENV, which is
               usually set in your .profile. ENV is evaluated and, if it is set to an existing
               file, that file is executed. By convention, ENV is usually set to .kshrc but may
               be set to any file name.

These files provide the means for customizing the shell environment to fit your needs.

2.
Bourne Shell sh:
----------------

It looks for the following files and executes them, if they exist:

/etc/profile
.profile       in the home directory, for example "/home/user1/.profile"

3.3 Add a user in AIX:
----------------------

You can also use the useradd command, just as in Solaris. Or use the native "mkuser" command.

# mkuser albert

The mkuser command does not create password information for a user. It initializes the password
field with an * (asterisk). Later, this field is set with the passwd or pwdadm command. New
accounts are disabled until the passwd or pwdadm commands are used to add authentication
information to the /etc/security/passwd file.

You can use the Users application in Web-based System Manager to change user characteristics. You
could also use the System Management Interface Tool (SMIT) "smit mkuser" fast path to run this
command.

The /usr/lib/security/mkuser.default file contains the default attributes for new users. This file
is an ASCII file that contains user stanzas. These stanzas have attribute default values for users
created by the mkuser command. Each attribute has the Attribute=Value form. If an attribute has a
value of $USER, the mkuser command substitutes the name of the user. The end of each attribute pair
and stanza is marked by a new-line character.

There are two stanzas, user and admin, that can contain all defined attributes except the id and
admin attributes. The mkuser command generates a unique id attribute. The admin attribute depends
on whether the -a flag is used with the mkuser command.

A typical user stanza looks like the following:

user:
   pgrp = staff
   groups = staff
   shell = /usr/bin/ksh
   home = /home/$USER
   auth1 = SYSTEM

The syntax is:

# mkuser [ -R load_module ] [ -a ] [ Attribute=Value ...
] Name

To create the davis user account with the default values in the /usr/lib/security/mkuser.default
file, type:
# mkuser davis

To create the davis account with davis as an administrator, type:
# mkuser -a davis
Only the root user or users with the UserAdmin authorization can create davis as an administrative
user.

To create the davis user account and set the su attribute to a value of false, type:
# mkuser su=false davis

To create the davis user account that is identified and authenticated through the LDAP load module,
type:
# mkuser -R LDAP davis

To add davis to the groups finance and accounting, enter:
# chuser groups=finance,accounting davis

-- Add a user with the smit utility:
-- ---------------------------------

Start SMIT by entering:
smit

From the Main Menu, make the following selections:
-Security and Users
-Users
-Add a User to the System

The utility displays a form for adding new user information. Use the arrow keys to move through
the form. Do not press Enter until you are finished and ready to exit the screen. Fill in the
appropriate fields of the Create User form (as listed in Create User Form) and press Enter. The
utility exits the form and creates the new user.

-- Using SMIT to Create a Group:
-- -----------------------------

Use the following procedure to create a group. Start SMIT by entering the following command:
smit

The utility displays the Main Menu. From the Main Menu, make the following selections:
-Security and Users
-Users
-Add a Group to the System

The utility displays a form for adding new group information. Type the group name in the Group
Name field and press Enter. The group name must be eight characters or less.
The utility creates the new group, automatically assigns the next available GID, and exits the
form.

Primary Authentication method of system:
----------------------------------------

To check whether root has a primary authentication method of SYSTEM, use the following command:

# lsuser -a auth1 root

If needed, change the value by using:

# chuser auth1=SYSTEM root

3.4 Add a user in HP-UX:
------------------------

-- Example 1:

Add user john to the system with all of the default attributes:

# useradd john

Add the user john to the system with a UID of 222 and a primary group of staff:

# useradd -u 222 -g staff john

-- Example 2:

=> Add a user called guestuser as per following requirements
=> Primary group member of guests
=> Secondary group member of www and accounting
=> Shell must be /usr/bin/bash3
=> Home directory must be /home/guestuser

# useradd -g guests -G www,accounting -d /home/guestuser -s /usr/bin/bash3 -m guestuser
# passwd guestuser

3.5 Add a user in Linux Redhat:
-------------------------------

You can use tools like useradd or groupadd to create new users and groups from the shell prompt.
But an easier way to manage users and groups is through the graphical application, User Manager.

Users are described in the /etc/passwd file.
Groups are stored on Red Hat Linux in the /etc/group file.

Or invoke the Gnome Linuxconf GUI tool by typing "linuxconf". In Red Hat Linux, linuxconf is found
in the /bin directory.

================================
4.
Change filemode, permissions:
================================

Permissions are given to:

u = user
g = group
o = other/world
a = all

file/directory permissions (also called "filemodes") are:

r = read
w = write
x = execute

special modes are:

X = set execute only if already set (this one is particularly sexy, look below)
s = set setuid/setgid bit
t = set sticky bit

Examples:
---------

readable by all, everyone:
% chmod a+r essay.001

to remove read, write and execute permissions on the file biglist for the group and others:
% chmod go-rwx biglist

make executable:
% chmod +x mycommand

set mode:
% chmod 644 filename

rwxrwxrwx=777
rw-rw-rw-=666
rw-r--r--=644   corresponds to umask 022
r-xr-xr-x=555
rwxrwxr-x=775

1 = execute
2 = write
4 = read

note that the total is 7
execute and read are:   1+4=5
read and write are:     2+4=6
read, write and exec:   1+2+4=7
and so on

directories must always be executable...

so a file with, say, 640, means: the owner can read and write (4+2=6), the group can read (4), and
everyone else has no permission to use the file (0).

chmod -R a+X .
This command would set the executable bit (for all users) of all directories and executables below
the current directory that presently have an execute bit set. Very helpful when you want to set all
your binary files executable for everyone other than you, without having to set the executable bit
of all your conf files, for instance. *wink*

chmod -R g+w .
This command would set all the contents below the current directory writable by your current group.

chmod -R go-rwx .
This command would remove permissions for group and world users without changing the bits for the
file owner. Now you don't have to worry that 'find . -type f -exec chmod 600 {} \;' will change
your binary files non-executable. Further, you don't need to run an additional command to chmod
your directories.

chmod u+s /usr/bin/run_me_setuid
This command would set the setuid bit of the file.
It's simply easier than remembering which number to use when wanting to setuid/setgid, IMHO.

========================
5. About the sticky bit:
========================

- This info is valid for most Unix OS including Solaris and AIX:
----------------------------------------------------------------

A 't' or 'T' as the last character of the "ls -l" mode characters indicates that the "sticky"
(save text image) bit is set. See ls(1) for an explanation of the distinction between 't' and 'T'.

The sticky bit has a different meaning, depending on the type of file it is set on...

sticky bit on directories
-------------------------

[From chmod(2)]
If the mode bit S_ISVTX (sticky bit) is set on a directory, files inside the directory may be
renamed or removed only by the owner of the file, the owner of the directory, or the superuser
(even if the modes of the directory would otherwise allow such an operation).

[Example]
drwxrwxrwt 104 bin bin 14336 Jun 7 00:59 /tmp

Only root is permitted to turn the sticky bit on or off. In addition, the sticky bit applies to
anyone who accesses the file.

The syntax for setting the sticky bit on a directory /foo is as follows:

chmod +t /foo

sticky bit on regular files
---------------------------

[From chmod(2)]
If an executable file is prepared for sharing, mode bit S_ISVTX prevents the system from abandoning
the swap-space image of the program-text portion of the file when its last user terminates. Then,
when the next user of the file executes it, the text need not be read from the file system but can
simply be swapped in, thus saving time.

[From HP-UX Kernel Tuning and Performance Guide]
Local paging. When applications are located remotely, set the "sticky bit" on the application
binaries, using the chmod +t command. This tells the system to page the text to the local disk.
Otherwise, it is "retrieved" across the network. Of course, this would only apply when there is
actual paging occurring.
More recently, there is a kernel parameter, page_text_to_local, which, when set to 1, will tell the
kernel to page all NFS executable text pages to local swap space.

[Example]
-r-xr-xr-t 6 bin bin 24664 Nov 14 2000 /usr/bin/vi

Solaris:
--------

The sticky bit on a directory is a permission bit that protects files within that directory. If the
directory has the sticky bit set, only the owner of the file, the owner of the directory, or root
can delete the file. The sticky bit prevents a user from deleting other users' files from public
directories, such as uucppublic:

castle% ls -l /var/spool/uucppublic
drwxrwxrwt 2 uucp uucp 512 Sep 10 18:06 uucppublic
castle%

When you set up a public directory on a TMPFS temporary file system, make sure that you set the
sticky bit manually.

You can set sticky bit permissions by using the chmod command to assign the octal value 1 as the
first number in a series of four octal values. Use the following steps to set the sticky bit on a
directory:

1. If you are not the owner of the file or directory, become superuser.
2. Type chmod <1nnn> and press Return.
3. Type ls -l and press Return to verify that the permissions of the file have changed.

The following example sets the sticky bit permission on the pubdir directory:

castle% chmod 1777 pubdir
castle% ls -l pubdir
drwxrwxrwt 2 winsor staff 512 Jul 15 21:23 pubdir
castle%

================
6. About SETUID:
================

Each process has three user ID's:

the real user ID       (ruid)
the effective user ID  (euid)
the saved user ID      (suid)

The real user ID identifies the owner of the process, the effective uid is used in most access
control decisions, and the saved uid stores a previous user ID so that it can be restored later.
Similarly, a process has three group ID's.

When a process is created by fork, it inherits the three uid's from the parent process.
When a process executes a new file by exec..., it keeps its three uid's unless the set-user-ID bit
of the new file is set, in which case the effective uid and saved uid are assigned the user ID of
the owner of the new file.

When setuid (set-user identification) permission is set on an executable file, a process that runs
this file is granted access based on the owner of the file (usually root), rather than the user who
created the process. This permission enables a user to access files and directories that are
normally available only to the owner.

The setuid permission is shown as an s in the file permissions. For example, the setuid permission
on the passwd command enables a user to change passwords while assuming the permissions of the
root ID:

castle% ls -l /usr/bin/passwd
-r-sr-sr-x 3 root sys 96796 Jul 15 21:23 /usr/bin/passwd
castle%

You set setuid permissions by using the chmod command to assign the octal value 4 as the first
number in a series of four octal values. Use the following steps to set setuid permissions:

1. If you are not the owner of the file or directory, become superuser.
2. Type chmod <4nnn> and press Return.
3. Type ls -l and press Return to verify that the permissions of the file have changed.

The following example sets setuid permission on the myprog file:

# chmod 4555 myprog
-r-sr-xr-x 1 winsor staff 12796 Jul 15 21:23 myprog
#

The setgid (set-group identification) permission is similar to setuid, except that the effective
group ID for the process is changed to the group owner of the file, and a user is granted access
based on permissions granted to that group. The /usr/bin/mail program has setgid permissions:

castle% ls -l /usr/bin/mail
-r-x--s--x 1 bin mail 64376 Jul 15 21:27 /usr/bin/mail
castle%

When setgid permission is applied to a directory, files subsequently created in the directory
belong to the group the directory belongs to, not to the group of the creating process.
Any user who has write permission in the directory can create a file there; however, the file does
not belong to the group of the user, but instead belongs to the group of the directory.

You can set setgid permissions by using the chmod command to assign the octal value 2 as the first
number in a series of four octal values. Use the following steps to set setgid permissions:

1. If you are not the owner of the file or directory, become superuser.
2. Type chmod <2nnn> and press Return.
3. Type ls -l and press Return to verify that the permissions of the file have changed.

The following example sets setgid permission on the myprog2 file:

# chmod 2551 myprog2
# ls -l myprog2
-r-xr-s--x 1 winsor staff 26876 Jul 15 21:23 myprog2
#

=========================
7. Find command examples:
=========================

Introduction

The find command allows the Unix user to process a set of files and/or directories in a file
subtree. You can specify the following:

- where to search (pathname)
- what type of file to search for (-type: directories, data files, links)
- how to process the files (-exec: run a process against a selected file)
- the name of the file(s) (-name)
- perform logical operations on selections (-o and -a)

Search for a file with a specific name in a set of files (-name).

EXAMPLES
--------

# find . -name "rc.conf" -print

This command will search in the current directory and all sub directories for a file named rc.conf.
Note: The -print option will print out the path of any file that is found with that name. In
general, -print will print out the path of any file that meets the find criteria.

# find . -name "rc.conf" -exec chmod o+r '{}' \;

This command will search in the current directory and all sub directories. All files named rc.conf
will be processed by the chmod o+r command. The argument '{}' inserts each found file into the
chmod command line. The \; argument indicates the exec command line has ended.
The end result of this command is that all rc.conf files have the "other" permissions set to read
access (if the operator is the owner of the files).

How to find text in a set of files:
-----------------------------------

# find . -exec grep "www.athabasca" '{}' \; -print

This command will search in the current directory and all sub directories. All files that contain
the string will have their path printed to standard output.

# find . -exec grep "CI_ADJ_TYPE" {} \; -print

This command searches all files in all subdirectories for the text CI_ADJ_TYPE.

How to find files of a certain size:
------------------------------------

# find / -xdev -size +2048 -ls | sort -r +6
# find . -xdev -size +2048 -ls | sort -r +6

This command will find all files in the root directory larger than 1 MB (2048 512-byte blocks).

How to find files between dates:
--------------------------------

thread 1:
---------

olddate="200407010001"
newdate="200407312359"
touch -t $olddate ./tmpoldfile
touch -t $newdate ./tmpnewfile
find /path/to/directory -type f -anewer ./tmpoldfile ! -anewer ./tmpnewfile

"-anewer" compares the access time; use "-newer" if you want to compare the modify time.

thread 2:
---------

Touch 2 files, start_date and stop_date, like this:

$ touch -t 200603290000.00 start_date
$ touch -t 200603290030.00 stop_date

Ok, start_date is 03/29/06 midnight, stop_date is 03/29/06 30 minutes after midnight. You might
want to do a ls -al to check. On to find; you can use -newer and then ! -newer, like this:

$ find /dir -newer start_date ! -newer stop_date -print

Combine that with ls -l, and you get:

$ find /dir -newer start_date ! -newer stop_date -print0 | xargs -0 ls -l

(Or you can try -exec to execute ls -l. I am not sure of the format, so you have to muck around a
little bit.) HTH.

thread 3:
---------

ls -lrtR | awk '{print $6$7"\t"$9}' | grep Nov

thread 4:
---------

1) between 2 dates (say 15 Aug 08 to 31 Aug 08)

touch -t 200808150000 /tmp/start
touch -t 200808310000 /tmp/finish
find / -size +10k -newer /tmp/start -a !
-newer /tmp/finish

2) later than a specified date (say 25 Aug 08)

touch -t 200808250000 /tmp/ref
find / -size +10k ! -newer /tmp/ref

Other examples:
---------------

# find . -name file -print
# find / -name $1 -exec ls -l {} \;
# find / -user nep -exec ls -l {} \; >nepfiles.txt

In English: search from the root directory for any files owned by nep and execute an ls -l on the
file when any are found. Capture all output in nepfiles.txt.

In order to protect the asterisk from being expanded by the shell, it is necessary to use a
backslash to escape the asterisk, as in:

# find $HOME -name \*.txt -print

# find / -atime +30 -print
This prints files that have not been accessed in the last 30 days.

# find / -atime +100 -size +500000c -print
The find search criteria can be combined. This command will locate and list all files that were
last accessed more than 100 days ago, and whose size exceeds 500,000 bytes.

# find /opt/bene/process/logs -name 'ALBRACHT*' -mtime +90 -exec rm {} \;
# find /example /new/example -exec grep -l 'Where are you' {} \;
# find / \( -name a.out -o -name '*.o' \) -atime +7 -exec rm {} \;
# find . -name '*.trc' -mtime +3 -exec rm {} \;
# find / -fsonly hfs -print
# cd /; find . ! -path ./Disk -only -print | cpio -pdxm /Disk
# cd /; find . -path ./Disk -prune -o -print | cpio -pdxm /Disk
# cd /; find . -xdev -print | cpio -pdm /Disk
# find -type f -print | xargs chmod 444
# find -type d -print | xargs chmod 555
# find . -atime +1 -name '*' -exec rm -f {} \;
# find /tmp -atime +1 -name '*' -exec rm -f {} \;
# find /usr/tmp -atime +1 -name '*' -exec rm -f {} \;
# find / -name core -exec rm -f {} \;
# find .
-name "*.dbf" -mtime -2 -exec ls {} \;

* Search and list all files from current directory and down for the string ABC:
find ./ -name "*" -exec grep -H ABC {} \;
find ./ -type f -print | xargs grep -H "ABC" /dev/null
egrep -r ABC *

* Find all files of a given type from current directory on down:
find ./ -name "*.conf" -print

* Find all user files larger than 5Mb:
find /home -size +5000000c -print

* Find all files owned by a user (defined by user id number, see /etc/passwd) on the system:
  (could take a very long time)
find / -user 501 -print

* Find all files created or updated in the last five minutes:
  (Great for finding effects of make install)
find / -cmin -5

* Find all files in group 20 and change them to group 102: (execute as root)
find / -group 20 -exec chown :102 {} \;

* Find all setuid and setgid executables:
find / \( -perm -4000 -o -perm -2000 \) -type f -exec ls -ldb {} \;
find / -type f -perm +6000 -ls

Example:
--------

cd /database/oradata/pegacc/archive
archdir=`pwd`
if [ "$archdir" = "/database/oradata/pegacc/archive" ]
then
  find . -name "*.dbf" -mtime +5 -exec rm {} \;
else
  echo "error in maintenance of PEGACC archives" >> /opt/app/oracle/admin/log/archmaint.log
fi

Example:
--------

The following example shows how to find files larger than 400 blocks in the current directory:

# find . -size +400 -print

REAL COOL EXAMPLE:
------------------

This example could even help in recovery of a file:

In some rare cases a strangely-named file will show itself in your directory and appear to be
un-removable with the rm command. Here the use of ls -li, and find with its -inum [inode] primary,
does the job.

Let's say that ls -l shows your irremovable file as:

-rw------- 1 smith smith 0 Feb 1 09:22 ?*?*P

Type ls -li to get the index node, or inode:

153805 -rw------- 1 smith smith 0 Feb 1 09:22 ?*?*P

The inode for this file is 153805. Use find -inum [inode] to make sure that the file is correctly
identified:

% find . -inum 153805 -print
./?*?*P

Here, we see that it is.
Then use the -exec functionality to do the remove:

% find . -inum 153805 -print -exec /bin/rm {} \;

Note that if this strangely named file were not of zero length, it might contain accidentally
misplaced and wanted data. Then you might want to determine what kind of data the file contains
and move the file to some temporary directory for further investigation, for example:

% find . -inum 153805 -print -exec /bin/mv {} unknown.file \;

This will rename the file to unknown.file, so you can easily inspect it.

COOL EXAMPLE: Using find and cpio to create really good backups:
----------------------------------------------------------------

Suppose you have a lot of subdirs and files in "/dir1/dira". Now you want to copy, or backup, this
to "/dir2/dirb". And not only just the files and subdirs, BUT ALSO all filemodes (permissions),
ownership information, acl's etc.
Then DO NOT USE "cp -R" or something similar. Instead use "find" in combination with the "cpio"
backup command:

# cd /dir1/dira
# find . | cpio -pvdm /dir2/dirb

Note: difference between mtime and atime:
-----------------------------------------

In using the find command where you want to delete files older than a certain date, you can use
commands like:

find . -name "*.log" -mtime +30 -exec rm {} \;
or
find . -name "*.dbf" -atime +30 -exec rm {} \;

Why should you choose, or not choose, between atime and mtime? It is important to distinguish
between a file or directory's change time (ctime), access time (atime), and modify time (mtime).

ctime -- In UNIX, it is not possible to tell the actual creation time of a file. The ctime --
change time -- is the time when changes were made to the file or directory's inode (owner,
permissions, etc.). The ctime is also updated when the contents of a file change. It is needed by
the dump command to determine if the file needs to be backed up. You can view the ctime with the
ls -lc command.

atime -- The atime -- access time -- is the time when the data of a file was last accessed.
Displaying the contents of a file or executing a shell script will update a file's atime, for
example.

mtime -- The mtime -- modify time -- is the time when the actual contents of a file was last
modified. This is the time displayed in a long directory listing (ls -l).

That's why backup utilities use the mtime when performing incremental backups: when the utility
reads the data for a file that is to be included in a backup, it does not affect the file's
modification time, but it does affect the file's access time.

So for most practical reasons, if you want to delete logfiles (or other files) older than a
certain date, it's best to use the mtime attribute.

How to make those times visible?

"ls -lu" shows atime
"ls -lc" shows ctime
"ls -l"  shows mtime
"istat filename" (AIX) will show all three:

pago-am1:/usr/local/bb>istat bb18b3.tar.gz
Inode 20 on device 10/9  File
Protection: rw-r--r--
Owner: 100(bb)  Group: 100(bb)
Link count: 1   Length 427247 bytes

Last updated:  Tue Aug 14 11:01:46 2001
Last modified: Thu Jun 21 07:36:32 2001
Last accessed: Thu Nov 01 20:38:46 2001

===================
7. Crontab command:
===================

Cron is used to schedule or run periodically all sorts of executable programs or shell scripts,
like backup runs, housekeeping jobs etc. The crond daemon makes it all happen.

Who has access to cron is, on most unixes, determined by the "cron.allow" and "cron.deny" files.
Every allowed user can have its own "crontab" file. The crontab of root is typically used for
system administrative jobs.

On most unixes the relevant files can be found in:

/var/spool/cron/crontabs
or
/var/adm/cron
or
/etc/cron.d

For example, on Solaris the /var/adm/cron/cron.allow and /var/adm/cron/cron.deny files control
which users can use the crontab command.

Most common usage:

- if you just want a listing:      crontab -l
- if you want to edit and change:  crontab -e

crontab [ -e | -l | -r | -v | File ]
-e: edit, submit   -r: remove   -l: list

A crontab file contains entries for each cron job.
Entries are separated by newline characters. Each crontab file entry contains six fields separated
by spaces or tabs in the following form:

minute hour day_of_month month weekday command

0 0 * 8 * /u/harry/bin/maintenance

Notes:
------

Note 1: start and stop cron:
----------------------------

-- Solaris and some other unixes:

The proper way to stop and restart cron is:

# /etc/init.d/cron stop
# /etc/init.d/cron start

In Solaris 10 you could use the following commands as well:

# svcadm refresh cron
# svcadm restart cron

-- Other way to restart cron:

In most unixes, cron is started by init, and there is a record in the /etc/inittab file which makes
that happen. Check if your system indeed has a record for cron in the inittab file. The type of
start should be "respawn", which means that should the superuser do a "kill -9 crond", the cron
daemon is simply restarted again. Again, preferably, there should be a stop and start script to
restart cron.

Especially on AIX, there is no true way to restart cron in a neat way: neither via the Resource
Controller (startsrc command) nor via a script is a standard method available. Just kill crond and
it will be restarted.

-- On many Linux distros:

To restart the cron daemon, you could do either a "service crond restart" or a "service crond
reload".

Note 2:
-------

Create a cronjobs file.

You can do this on your local computer in Notepad, or you can create the file directly on your
Virtual Server using your favorite UNIX text editor (pico, vi, etc). Your file should contain the
following entries:

MAILTO="USER@YOUR-DOMAIN.NAME"
0 1 1 1-12/3 * /usr/local/bin/vnukelog

This will run the command "/usr/local/bin/vnukelog" (which clears all of your log files) at 1 AM on
the first day of the first month of every quarter, or January, April, July, and October (1-12/3).
Obviously, you will need to substitute a valid e-mail address in the place of
"USER@YOUR-DOMAIN.NAME".
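Before registering such a file it is cheap to sanity-check it: apart from comments and
VARIABLE=value lines, every entry needs at least six fields (five time fields plus the command).
A small sketch of such a check (my own illustration; the check_cronfile name is made up):

```shell
#!/bin/sh
# Validate a crontab-style file: skip blank lines, comments and
# MAILTO=... style assignments; flag entries with fewer than 6 fields.
check_cronfile() {
    awk '
        /^[ \t]*(#|$)/   { next }                       # comment or blank
        /^[A-Za-z_]+=/   { next }                       # VAR=value line
        NF < 6           { bad++; print "bad entry, line " NR ": " $0 }
        END              { exit (bad > 0) }
    ' "$1"
}

# The example file from the note above:
cat > /tmp/cronjobs <<'EOF'
MAILTO="USER@YOUR-DOMAIN.NAME"
0 1 1 1-12/3 * /usr/local/bin/vnukelog
EOF

check_cronfile /tmp/cronjobs && echo "cronjobs looks OK"
# prints "cronjobs looks OK"
```

The function only counts fields; it does not validate the ranges inside each time field, which
cron itself will reject at "crontab cronjobs" time anyway.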
If you have created this file on your local computer, FTP the file up to your Virtual Server and
store it in your home directory under the name "cronjobs" (you can actually use any name you would
like).

Register your cronjobs file with the system.

After you have created your cronjobs file (and have uploaded it to your Virtual Server if
applicable), you need to Telnet to your server and register the file with the cron system daemon.
To do this, simply type:

crontab cronjobs

Or, if you used a name other than "cronjobs", substitute the name you selected for the occurrence
of "cronjobs" above.

Note 3:
-------

# use /bin/sh to run commands, no matter what /etc/passwd says
SHELL=/bin/sh
# mail any output to `paul', no matter whose crontab this is
MAILTO=paul
#
# run at 5 minutes past every hour from 06:00 through 18:00, every day
5 6-18 * * * /opt/app/oracle/admin/scripts/grepora.sh
# run at 2:15pm on the first of every month -- output mailed to paul
15 14 1 * * $HOME/bin/monthly
# run at 10 pm on weekdays, annoy Joe
0 22 * * 1-5 mail -s "It's 10pm" joe%Joe,%%Where are your kids?%
23 0-23/2 * * * echo "run 23 minutes after midn, 2am, 4am ..., everyday"
5 4 * * sun echo "run at 5 after 4 every sunday"

2>&1 means: standard error is redirected along with standard output. Standard error could also be
redirected to a different file, like:

ls > toto.txt 2> error.txt

If your shell is csh or tcsh, you would redirect standard output and standard error like this:

ls >& toto.txt

Csh or tcsh cannot redirect standard error separately.

Note 4:
-------

thread

Q:
> Isn't there a way to refresh cron to pick up changes made using
> crontab -e? I made the changes but the specified jobs did not run.
> I'm thinking I need to refresh cron to pick up the changes. Is this
> true? Thanks.

A:
Crontab -e should do that for you; that's the whole point of using it rather than editing the file
yourself. Why do you think the job didn't run? Post the crontab entry and the script.
Give details of the version of Tru64 and the patch level. Then perhaps we can help you to figure
out the real cause of the problem. Hope this helps.

A:
I have seen the following problem when editing the cron file for another user:

crontab -e idxxxxxx

This changed the control file; when I verified with crontab -l the contents were correctly shown,
but the cron daemon did not execute the new contents. To solve the problem, I needed to use the
following commands:

su - idxxxxxx
crontab -l | crontab

This seems to work ... since then I prefer the following:

su - idxxxxxx
crontab -e

which seems to work also ...

Note 5:
-------

On AIX it is observed that if the "daemon=" attribute of a user is set to false, this user cannot
use crontab, even if the account is placed in cron.allow. You need to set the attribute to
"daemon=true".

* daemon    Defines whether the user can execute programs using the system
*           resource controller (SRC). Possible values: true or false.

Note 6:
-------

If you want to quickly test the crontab of a user:

su - user

and put the following in the crontab of that user:

* * * * * date >/tmp/elog

After checking the /tmp/elog file, which will rapidly fill with dates, don't forget to remove the
crontab entry shown above.

Note 7: the at and atq commands:
--------------------------------

On many unix systems the "at" and "atq" scheduling commands are available. With "at", you can
schedule commands, and with "atq" you can view all your own, or other users', scheduled tasks.

atq - Display the jobs queued to run at specified times.

For example, on Solaris:

The at command is used to schedule jobs for execution at a later time. Unlike crontab, which
schedules a job to happen at regular intervals, a job submitted with at executes once, at the
designated time. To submit an at job, type at followed by the time that you would like the program
to execute. You'll see the at> prompt displayed, and it's here that you enter the at commands.
When you are finished entering the at commands, press Ctrl-D to exit the at prompt and
submit the job, as shown in the following example:

# at 07:45am today
at> who > /tmp/log
at> <Ctrl-D>
job 912687240.a at Thu Jun 6 07:14:00

When you submit an at job, it is assigned a job identification number, which becomes its
filename along with the .a extension. The file is stored in the /var/spool/cron/atjobs
directory. In much the same way as it schedules crontab jobs, the cron daemon controls
the scheduling of at files.

===========================
8. Job control, background:
===========================

To put a sort job (or any other job) in the background:
# sort < foo > bar &

To show jobs:
# jobs

To show processes:
# ps
# ps -ef | grep ora

Job in foreground -> background:
Ctrl-Z (suspend), then
# bg      or    bg %jobID

Job in background -> foreground:
# fg %jobID

Stop a process:
# kill -9 3535      (3535 is the pid, process id)

To stop a background process you may try this:
# kill -QUIT 3421

-- Kill all processes of a specific user:
-- --------------------------------------
To kill all processes of a specific user, enter:

# ps -u [user-id] -o pid | grep -v PID | xargs kill -9

Another way: use who to check out your current users and their terminals, then kill
all processes related to a specific terminal:

# fuser -k /dev/pts/[#]

Yet another method: su to the user-id you wish to kill all processes of, and enter:

# su - [user-id] -c kill -9 -1

Or su - to that user-id and use the killall command, which is available on most
unixes, for example AIX:

# killall

So in order to kill all processes of a user:

# kill -9 -1    # not on all unixes
or
# killall       # not on all unixes

The nohup command:
------------------

When working with the UNIX operating system, there will be times when you will want to
run commands that are immune to logouts or unplanned login session terminations. This
is especially true for UNIX system administrators. The UNIX command for handling this
job is the nohup (no hangup) command.
Normally when you log out, or your session terminates unexpectedly, the system will kill
all processes you have started. Starting a command with nohup counters this by arranging
for all stopped, running, and background jobs to ignore the SIGHUP signal.

The syntax for nohup is:

nohup command [arguments]

You may optionally add an ampersand to the end of the command line to run the job in
the background:

nohup command [arguments] &

If you do not redirect output from a process kicked off with nohup, both standard output
(stdout) and standard error (stderr) are sent to a file named nohup.out. This file will
be created in $HOME (your home directory) if it cannot be created in the working
directory. Real-time monitoring of what is being written to nohup.out can be
accomplished with the "tail -f nohup.out" command.

Although the nohup command is extremely valuable to UNIX system administrators, it is
also a must-know tool for others who run lengthy or critical processes on UNIX systems.

The nohup command runs the command specified by the Command parameter and any related
Arg parameters, ignoring all hangup (SIGHUP) signals. Use the nohup command to run
programs in the background after logging off. To run a nohup command in the background,
add an & (ampersand) to the end of the command.

Whether or not the nohup command output is redirected to a terminal, the output is
appended to the nohup.out file in the current directory. If the nohup.out file is not
writable in the current directory, the output is redirected to the $HOME/nohup.out file.
If neither file can be created nor opened for appending, the command specified by the
Command parameter is not invoked. If the standard error is a terminal, all output
written by the named command to its standard error is redirected to the same file
descriptor as the standard output.
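A minimal sketch of the behavior described above: when you redirect stdout and stderr
yourself, nohup does not create a nohup.out file, and the job ignores SIGHUP. The
directory and file names below are only for illustration.

```shell
#!/bin/sh
# Sketch: start a stand-in batch job under nohup, with output redirected
# explicitly so we do not rely on a nohup.out file being created.
mkdir -p /tmp/nohup_demo
cd /tmp/nohup_demo

# stdout and stderr are explicitly redirected, so no nohup.out appears here
nohup sh -c 'echo "batch job started"; sleep 1; echo "batch job done"' \
    > job.log 2>&1 &
JOBPID=$!

# the job survives a SIGHUP: nohup arranged for the signal to be ignored
kill -HUP "$JOBPID" 2>/dev/null

wait "$JOBPID"
tail -n 1 job.log       # last line: batch job done
```

In a real session you would log off at this point instead of calling wait; the job
keeps running and you inspect job.log later with "tail -f job.log".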
To run a command in the background after you log off, enter:

$ nohup find / -print &

After you enter this command, the following is displayed:

670
$ Sending output to nohup.out

The process ID number changes to that of the background process started by & (ampersand).
The message "Sending output to nohup.out" informs you that the output from the
"find / -print" command is in the nohup.out file. You can log off after you see these
messages, even if the find command is still running.

Example of ps -ef on an AIX5 system:

[LP 1]root@ol16u209:ps -ef
     UID     PID    PPID   C    STIME    TTY  TIME CMD
    root       1       0   0   Oct 17      -  0:00 /etc/init
    root    4198       1   0   Oct 17      -  0:00 /usr/lib/errdemon
    root    5808       1   0   Oct 17      -  1:15 /usr/sbin/syncd 60
  oracle    6880       1   0 10:27:26      -  0:00 ora_lgwr_SPLDEV1
    root    6966       1   0   Oct 17      -  0:00 /usr/ccs/bin/shlap
    root    7942   43364   0   Oct 17      -  0:00 sendmail: accepting connections
 alberts    9036    9864   0 20:41:49      -  0:00 sshd: alberts@pts/0
    root    9864   44426   0 20:40:21      -  0:00 sshd: alberts [priv]
    root   27272   36280   1 20:48:03  pts/0  0:00 ps -ef
  oracle   27856       1   0 10:27:26      -  0:01 ora_smon_SPLDEV1
  oracle   31738       1   0 10:27:26      -  0:00 ora_dbw0_SPLDEV1
  oracle   31756       1   0 10:27:26      -  0:00 ora_reco_SPLDEV1
 alberts   32542    9036   0 20:41:49  pts/0  0:00 -ksh
 maestro   33480   34394   0 05:59:45      -  0:00 /prj/maestro/maestro/bin/batchman -parm 32000
    root   34232   33480   0 05:59:45      -  0:00 /prj/maestro/maestro/bin/jobman
 maestro   34394   45436   0 05:59:45      -  0:00 /prj/maestro/maestro/bin/mailman -parm 32000 -- 2002 OL16U209 CONMAN UNIX 6.
    root   34708       1   0 13:55:51   lft0  0:00 /usr/sbin/getty /dev/console
  oracle   35364       1   0 10:27:26      -  0:01 ora_cjq0_SPLDEV1
  oracle   35660       1   0 10:27:26      -  0:04 ora_pmon_SPLDEV1
    root   36280   32542   0 20:45:06  pts/0  0:00 -ksh
    root   36382   43364   0   Oct 17      -  0:00 /usr/sbin/rsct/bin/IBM.ServiceRMd
    root   36642   43364   0   Oct 17      -  0:00 /usr/sbin/rsct/bin/IBM.CSMAgentRMd
    root   36912   43364   0   Oct 17      -  0:03 /usr/opt/ifor/bin/i4lmd -l /var/ifor/logdb -n clwts
    root   37186   43364   0   Oct 17      -  0:00 /etc/ncs/llbd
    root   37434   43364   0   Oct 17      -  0:17 /usr/opt/ifor/bin/i4llmd -b -n wcclwts -l /var/ifor/llmlg
    root   37738   37434   0   Oct 17      -  0:00 /usr/opt/ifor/bin/i4llmd -b -n wcclwts -l /var/ifor/llmlg
    root   37946       1   0   Oct 17      -  0:00 /opt/hitachi/HNTRLib2/bin/hntr2mon -d
  oracle   38194       1   0   Oct 17      -  0:00 /prj/oracle/product/9.2.0.3/bin/tnslsnr LISTENER -inherit
    root   38468   43364   0   Oct 17      -  0:00 /usr/sbin/rsct/bin/IBM.AuditRMd
    root   38716       1   0   Oct 17      -  0:00 /usr/bin/itesmdem itesrv.ini /etc/IMNSearch/search/
  imnadm   39220       1   0   Oct 17      -  0:00 /usr/IMNSearch/httpdlite/httpdlite -r /etc/IMNSearch/httpdlite/httpdlite.con
    root   39504   36912   0   Oct 17      -  0:00 /usr/opt/ifor/bin/i4lmd -l /var/ifor/logdb -n clwts
    root   39738   43364   0   Oct 17      -  0:01 /usr/DynamicLinkManager/bin/dlmmgr
    root   40512   43364   0   Oct 17      -  0:01 /usr/sbin/rsct/bin/rmcd -r
    root   40784   43364   0   Oct 17      -  0:00 /usr/sbin/rsct/bin/IBM.ERrmd
    root   41062       1   0   Oct 17      -  0:00 /usr/sbin/cron
     was   41306       1   0   Oct 17      -  2:10 /prj/was/java/bin/java -Xmx256m -Dwas.status.socket=32776 -Xms50m -Xbootclas
  oracle   42400       1   0 10:27:26      -  0:02 ora_ckpt_SPLDEV1
    root   42838       1   0   Oct 17      -  0:00 /usr/sbin/uprintfd
    root   43226   43364   0   Oct 17      -  0:00 /usr/sbin/nfsd 3891
    root   43364       1   0   Oct 17      -  0:00 /usr/sbin/srcmstr
    root   43920   43364   0   Oct 17      -  0:00 /usr/sbin/aixmibd
    root   44426   43364   0   Oct 17      -  0:00 /usr/sbin/sshd -D
    root   44668   43364   0   Oct 17      -  0:00 /usr/sbin/portmap
    root   44942   43364   0   Oct 17      -  0:00 /usr/sbin/snmpd
    root   45176   43364   0   Oct 17      -  0:00 /usr/sbin/snmpmibd
 maestro   45436       1   0   Oct 17      -  0:00 /prj/maestro/maestro/bin/netman
    root   45722   43364   0   Oct 17      -  0:00 /usr/sbin/inetd
    root   45940   43364   0   Oct 17      -  0:00 /usr/sbin/muxatmd
    root   46472   43364   0   Oct 17      -  0:00 /usr/sbin/hostmibd
    root   46780   43364   0   Oct 17      -  0:00 /etc/ncs/glbd
    root   46980   43364   0   Oct 17      -  0:00 /usr/sbin/qdaemon
    root   47294       1   0   Oct 17      -  0:00 /usr/local/sbin/syslog-ng -f /usr/local/etc/syslog-ng.conf
    root   47484   43364   0   Oct 17      -  0:00 /usr/sbin/rpc.lockd
  daemon   48014   43364   0   Oct 17      -  0:00 /usr/sbin/rpc.statd
    root   48256   43364   0   Oct 17      -  0:00 /usr/sbin/rpc.mountd
    root   48774   43364   0   Oct 17      -  0:00 /usr/sbin/biod 6
    root   49058   43364   0   Oct 17      -  0:00 /usr/sbin/writesrv
[LP 1]root@ol16u209:

Another example of ps -ef on an AIX5 system:

# ps -ef
     UID     PID    PPID   C    STIME    TTY  TIME CMD
    root       1       0   0   Jan 23      -  0:33 /etc/init
    root   69706       1   0   Jan 23      -  0:00 /usr/lib/errdemon
    root   81940       1   0   Jan 23      -  0:00 /usr/sbin/srcmstr
    root   86120       1   2   Jan 23      - 236:39 /usr/sbin/syncd 60
    root   98414       1   0   Jan 23      -  0:00 /usr/ccs/bin/shlap64
    root  114802   81940   0   Jan 23      -  0:32 /usr/sbin/rsct/bin/IBM.CSMAgentRMd
    root  135366   81940   0   Jan 23      -  0:00 /usr/sbin/sshd -D
    root  139446   81940   0   Jan 23      -  0:07 /usr/sbin/rsct/bin/rmcd -r
    root  143438       1   0   Jan 23      -  0:00 /usr/sbin/uprintfd
    root  147694       1   0   Jan 23      -  0:26 /usr/sbin/cron
    root  155736       1   0   Jan 23      -  0:00 /usr/local/sbin/syslog-ng -f /usr/local/etc/syslog-ng.conf
    root  163996   81940   0   Jan 23      -  0:00 /usr/sbin/rsct/bin/IBM.ERrmd
    root  180226   81940   0   Jan 23      -  0:00 /usr/sbin/rsct/bin/IBM.ServiceRMd
    root  184406   81940   0   Jan 23      -  0:00 /usr/sbin/qdaemon
    root  200806       1   0   Jan 23      -  0:08 /opt/hitachi/HNTRLib2/bin/hntr2mon -d
    root  204906   81940   0   Jan 23      -  0:00 /usr/sbin/rsct/bin/IBM.AuditRMd
    root  217200       1   0   Jan 23      -  0:00 ./mflm_manager
    root  221298   81940   0   Jan 23      -  1:41 /usr/DynamicLinkManager/bin/dlmmgr
    root  614618       1   0   Apr 03   lft0  0:00 -ksh
 reserve 1364024 1548410   0 07:10:10  pts/0  0:00 -ksh
    root 1405140 1626318   1 08:01:38  pts/0  0:00 ps -ef
    root 1511556  614618   2 07:45:52   lft0  0:41 tar -cf /dev/rmt1.1 /spl
 reserve 1548410 1613896   0 07:10:10      -  0:00 sshd: reserve@pts/0
    root 1613896  135366   0 07:10:01      -  0:00 sshd: reserve [priv]
    root 1626318 1364024   1 07:19:13  pts/0  0:00 -ksh

Some more examples:

# nohup somecommand & sleep 1; tail -f preferred-name
# nohup make bzImage &
# tail -f nohup.out
# nohup make modules 1> modules.out 2> modules.err &
# tail -f modules.out

==========================================
9. Backup commands, TAR, and Zipped files:
==========================================

For SOLARIS as well as AIX, and many other unixes, the following commands can be used:
tar, cpio, dd, gzip/gunzip, compress/uncompress, backup and restore.

Very important: if you will back up to tape, make sure you know which is the "rewinding"
class and which is the "nonrewinding" class of your tape device.

9.1 tar: Short for "Tape Archiver":
===================================

Some examples should explain the usage of "tar" to create backups, or to create
easy-to-transport .tar files.

Create a backup to tape device 0hc of file sys01.dbf:

# tar -cvf /dev/rmt/0hc /u01/oradata/sys01.dbf
# tar -rvf /dev/rmt/0hc /u02/oradata/data_01.dbf

-c create
-r append
-x extract
-v verbose
-t list

Extract the contents of example.tar and display the files as they are extracted:

# tar -xvf example.tar

Create a tar file named backup.tar from the contents of the directory /home/ftp/pub:

# tar -cf backup.tar /home/ftp/pub

List the contents of example.tar to the screen:

# tar -tvf example.tar

To restore the file /home/bcalkins/.profile from the archive:

- First we do a backup:
# tar -cvf /dev/rmt/0 /home/bcalkins

- And later we do a restore:
# tar -xvf /dev/rmt/0 /home/bcalkins/.profile

If you use an absolute path, you can only restore into "a like" destination directory.
If you use a relative path, you can restore into any directory.
In this case, use tar with a relative pathname. For example, if you want to back up
/home/bcalkins, change to that directory and use:

# tar -cvf backup_oracle_201105.tar ./*

To extract the directory conv:

# tar -xvf /dev/rmt0 /u02/oradata/conv

Example:
--------

mt -f /dev/rmt1 rewind
mt -f /dev/rmt1.1 fsf 6
tar -xvf /dev/rmt1.1 /data/download/expdemo.zip

Most common error messages with tar:
------------------------------------

-- 0511-169: A directory checksum error on media: MediaName not equal to Number
Possible causes: from the command line, you issued the tar command to extract files
from an archive that was not created with the tar command.

-- 0511-193: An error occurred while reading from the media
Possible causes: you issued the tar command to read an archive from a tape device that
has a different block size than when the archive was created.
Solution:
# chdev -l rmt0 -a block_size=0

-- File too large:

Extra note on the tar command on AIX:
-------------------------------------

If you need to back up multiple large mountpoints to a large tape, you might think you
can use something like:

tar -cvf /dev/rmt1 /spl
tar -rvf /dev/rmt1 /prj
tar -rvf /dev/rmt1 /opt
tar -rvf /dev/rmt1 /usr
tar -rvf /dev/rmt1 /data
tar -rvf /dev/rmt1 /backups
tar -rvf /dev/rmt1 /u01/oradata
tar -rvf /dev/rmt1 /u02/oradata
tar -rvf /dev/rmt1 /u03/oradata
tar -rvf /dev/rmt1 /u04/oradata
tar -rvf /dev/rmt1 /u05/oradata

Actually, on AIX this is not OK. The tape will rewind after each tar command, so
effectively you will end up with ONLY the last backup statement.
You should use the non-rewinding class instead, for example:

tar -cf /dev/rmt1.1 /spl
tar -cf /dev/rmt1.1 /apps
tar -cf /dev/rmt1.1 /prj
tar -cf /dev/rmt1.1 /software
tar -cf /dev/rmt1.1 /opt
tar -cf /dev/rmt1.1 /usr
tar -cf /dev/rmt1.1 /data
tar -cf /dev/rmt1.1 /backups
#tar -cf /dev/rmt1.1 /u01/oradata
#tar -cf /dev/rmt1.1 /u02/oradata
#tar -cf /dev/rmt1.1 /u03/oradata
#tar -cf /dev/rmt1.1 /u04/oradata
#tar -cf /dev/rmt1.1 /u05/oradata

Use this table to decide which class to use. It shows the names of the rmt special
files and their characteristics:

Special_File  Rewind_on_Close  Retension_on_Open  Density_Setting
/dev/rmt*     Yes              No                 #1
/dev/rmt*.1   No               No                 #1
/dev/rmt*.2   Yes              Yes                #1
/dev/rmt*.3   No               Yes                #1
/dev/rmt*.4   Yes              No                 #2
/dev/rmt*.5   No               No                 #2
/dev/rmt*.6   Yes              Yes                #2
/dev/rmt*.7   No               Yes                #2

To restore an item from a logical tape, use commands as in the following example:

mt -f /dev/rmt1 rewind
mt -f /dev/rmt1.1 fsf 2     # puts the pointer at the beginning of block 3
mt -f /dev/rmt1.1 fsf 7     # puts the pointer at the beginning of block 8

Now you can use a command like, for example:

tar -xvf /dev/rmt1.1 /backups/oradb/sqlnet.log

Another example:

mt -f /dev/rmt1 rewind
mt -f /dev/rmt1.1 fsf 8
tar -xvf /dev/rmt1.1 /u01/oradata/spltrain/temp01.dbf

Tapedrives on Solaris:
----------------------

Tape devices on Solaris are named like /dev/rmt/0 or /dev/rmt/1. The default is
/dev/rmt/0. This is also configured in the "/kernel/drv/st.conf" file. If you need to
add support for a tape device, you need to modify this file.
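For orientation, an st.conf entry generally pairs a tape-config-list line with a
property list for the drive. The fragment below is only an illustrative sketch; the
vendor string, pretty name, and property values are examples and must be taken from
your drive's documentation, not copied as-is:

```text
# /kernel/drv/st.conf fragment (illustrative values only)
tape-config-list =
  "EXABYTE EXB-8500", "Mammoth EXB-8500 8mm Helical Scan", "EXB-8500";
EXB-8500 = 1,0x35,0,0x8c79,4,0x14,0x15,0x8c,0x8c,3;
```

After editing st.conf, the st driver must re-read it (for example via a reconfiguration
boot) before the new entry takes effect.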
First tape device name:  /dev/rmt/0
Second tape device name: /dev/rmt/1

You can also add a special character letter to specify density, using the following
format:

/dev/rmt/ZX

Z is the tape drive number, such as 0,1..n
X can be any one of the following (as supported by your device; read the manual of
your tape device & controller to see whether all of them are supported):

l - Low density
m - Medium density
h - High density
u - Ultra density
c - Compressed density
n - No rewinding

For example, to specify the first drive, high density, no rewinding, use device
/dev/rmt/0hn.

First drive, rewinding:     /dev/rmt/0
First drive, nonrewinding:  /dev/rmt/0n
Second drive, rewinding:    /dev/rmt/1
Second drive, nonrewinding: /dev/rmt/1n

Example backup script on AIX:
-----------------------------

#!/usr/bin/ksh
# BACKUP SCRIPT SPL SERVER PSERIES 550
# THIS IS THE PRIMARY BACKUP, TO THE TAPE ROBOT RMT1.
# NOTE: NEXT TO THIS BACKUP, THERE IS ALSO A BACKUP OF THE
# /backup DISK TO THE INTERNAL TAPE DRIVE RMT0.
# BECAUSE WE DO NOT YET FULLY KNOW WHETHER WE MUST STOP THE APPLICATIONS
# BEFORE THE BACKUP, THIS SCRIPT IS STILL UNDER REVISION.
# VERSION: 0.1
# DATE   : 27-12-2005
# PURPOSE OF THE SCRIPT:
# - STOP THE APPLICATIONS
# - THEN BACK UP TO TAPE
# - START THE APPLICATIONS
# CHECK BEFOREHAND THAT THE TAPE LIBRARY IS LOADED, VIA "/opt/backupscripts/load_lib.sh"

BACKUPLOG=/opt/backupscripts/backup_to_rmt1.log
export BACKUPLOG
DAYNAME=`date +%a`;export DAYNAME
DAYNO=`date +%d`;export DAYNO

########################################
# 1. LOG THE START TIME                #
########################################
echo "-----------------" >> ${BACKUPLOG}
echo "Start Backup 550:" >> ${BACKUPLOG}
date >> ${BACKUPLOG}

########################################
# 2. STOP THE APPLICATIONS             #
########################################
# STOP ALL ORACLE DATABASES
su - oracle -c "/opt/backupscripts/stop_oracle.sh"
sleep 30
# STOP WEBSPHERE
cd /prj/was/bin
./stopServer.sh server1 -username admin01 -password vga88nt
sleep 30
# SHUTDOWN ETM instances:
su - cissys -c '/spl/SPLDEV1/bin/splenviron.sh -e SPLDEV1 -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLDEV2/bin/splenviron.sh -e SPLDEV2 -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLCONF/bin/splenviron.sh -e SPLCONF -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLPLAY/bin/splenviron.sh -e SPLPLAY -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLTST3/bin/splenviron.sh -e SPLTST3 -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLTST1/bin/splenviron.sh -e SPLTST1 -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLTST2/bin/splenviron.sh -e SPLTST2 -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLDEVP/bin/splenviron.sh -e SPLDEVP -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLPACK/bin/splenviron.sh -e SPLPACK -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLDEVT/bin/splenviron.sh -e SPLDEVT -c "spl.sh -t stop"'
sleep 2
# STOP SSH DAEMON
stopsrc -s sshd
sleep 2
date >> /opt/backupscripts/running.log
who >> /opt/backupscripts/running.log

########################################
# 3. BACKUP COMMANDS                   #
########################################
case $DAYNAME in
  Tue) tapeutil -f /dev/smc0 move 256 4116
       tapeutil -f /dev/smc0 move 4101 256
       ;;
  Wed) tapeutil -f /dev/smc0 move 256 4117
       tapeutil -f /dev/smc0 move 4100 256
       ;;
  Thu) tapeutil -f /dev/smc0 move 256 4118
       tapeutil -f /dev/smc0 move 4099 256
       ;;
  Fri) tapeutil -f /dev/smc0 move 256 4119
       tapeutil -f /dev/smc0 move 4098 256
       ;;
  Sat) tapeutil -f /dev/smc0 move 256 4120
       tapeutil -f /dev/smc0 move 4097 256
       ;;
  Mon) tapeutil -f /dev/smc0 move 256 4121
       tapeutil -f /dev/smc0 move 4096 256
       ;;
esac
sleep 50
echo "Starting the backup itself" >> ${BACKUPLOG}
mt -f /dev/rmt1 rewind
tar -cf /dev/rmt1.1 /spl
tar -cf /dev/rmt1.1 /apps
tar -cf /dev/rmt1.1 /prj
tar -cf /dev/rmt1.1 /software
tar -cf /dev/rmt1.1 /opt
tar -cf /dev/rmt1.1 /usr
tar -cf /dev/rmt1.1 /data
tar -cf /dev/rmt1.1 /backups
tar -cf /dev/rmt1.1 /u01/oradata
tar -cf /dev/rmt1.1 /u02/oradata
tar -cf /dev/rmt1.1 /u03/oradata
tar -cf /dev/rmt1.1 /u04/oradata
tar -cf /dev/rmt1.1 /u05/oradata
tar -cf /dev/rmt1.1 /u06/oradata
tar -cf /dev/rmt1.1 /u07/oradata
tar -cf /dev/rmt1.1 /u08/oradata
tar -cf /dev/rmt1.1 /home
tar -cf /dev/rmt1.1 /backups3
sleep 10
# TEMPORARY ACTION
date >> /opt/backupscripts/running.log
ps -ef | grep pmon >> /opt/backupscripts/running.log
ps -ef | grep BBL >> /opt/backupscripts/running.log
ps -ef | grep was >> /opt/backupscripts/running.log
who >> /opt/backupscripts/running.log
defragfs /prj
# END TEMPORARY ACTION

########################################
# 4. START THE APPLICATIONS            #
########################################
# START SSH DAEMON
startsrc -s sshd
sleep 2
# START ALL ORACLE DATABASES
su - oracle -c "/opt/backupscripts/start_oracle.sh"
sleep 30
# START ETM instances:
su - cissys -c '/spl/SPLDEV1/bin/splenviron.sh -e SPLDEV1 -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLDEV2/bin/splenviron.sh -e SPLDEV2 -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLCONF/bin/splenviron.sh -e SPLCONF -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLPLAY/bin/splenviron.sh -e SPLPLAY -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLTST3/bin/splenviron.sh -e SPLTST3 -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLTST1/bin/splenviron.sh -e SPLTST1 -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLTST2/bin/splenviron.sh -e SPLTST2 -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLDEVP/bin/splenviron.sh -e SPLDEVP -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLPACK/bin/splenviron.sh -e SPLPACK -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLDEVT/bin/splenviron.sh -e SPLDEVT -c "spl.sh -t start"'
sleep 2
# START WEBSPHERE
cd /prj/was/bin
./startServer.sh server1 -username admin01 -password vga88nt
sleep 30

########################################
# 5. LOG THE END TIME                  #
########################################
# Record the tape number and end time in the log:
tapeutil -f /dev/smc0 inventory | head -88 | tail -2 >> ${BACKUPLOG}
echo "Einde backup 550:" >> ${BACKUPLOG}
date >> ${BACKUPLOG}

Some examples about day vars:
-----------------------------

DAYNAME=`date +%a`;export DAYNAME
echo $DAYNAME
Thu

DAYNO=`date +%d`;export DAYNO
echo $DAYNO
29

weekday=`date +%a%A`; export weekday
echo $weekday
ThuThursday

weekday=`date +%a-%A`
echo $weekday
Thu-Thursday

%a Displays the locale's abbreviated weekday name.
%A Displays the locale's full weekday name.
%b Displays the locale's abbreviated month name.
%B Displays the locale's full month name.
%c Displays the locale's appropriate date and time representation. This is the default.
%C Displays the first two digits of the four-digit year as a decimal number (00-99).
   A year is divided by 100 and truncated to an integer.
%d Displays the day of the month as a decimal number (01-31). In a two-digit field,
   a 0 is used as leading space fill.
%D Displays the date in the format equivalent to %m/%d/%y.
%e Displays the day of the month as a decimal number (1-31). In a two-digit field,
   a blank space is used as leading space fill.

9.2 compress and uncompress:
============================

# compress -v bigfile.exe

This compresses bigfile.exe and renames it to bigfile.exe.Z.

# uncompress *.Z

This uncompresses the *.Z files.

9.3 gzip:
=========

To compress a file using gzip, execute the following command:

# gzip filename.tar

This will become filename.tar.gz

To decompress:

# gzip -d filename.tar.gz
# gunzip filename.tar.gz
# gzip -d users.dbf.gz

9.4 bzip2:
==========

# bzip2 filename.tar

This will become filename.tar.bz2

9.5 dd:
=======

Solaris:
--------

# dd if= of=

To duplicate a tape:
# dd if=/dev/rmt/0 of=/dev/rmt/1

To clone a disk with the same geometry:
# dd if=/dev/rdsk/c0t1d0s2 of=/dev/rdsk/c0t4d0s2 bs=128

AIX:
----

The same command syntax applies to IBM AIX. Here is an example on an AIX pSeries
machine with a floppy drive.

To clone a diskette:
# dd if=/dev/fd0 of=/tmp/ddcopy
# dd if=/tmp/ddcopy of=/dev/fd0

Note: on Linux distros the device associated with the floppy drive is also /dev/fd0.

9.6 cpio:
=========

solaris:
--------

cpio
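As a small sketch ahead of the cpio examples (the directory and file names are only
for illustration), cpio in copy-out (-o) mode reads a list of filenames from stdin and
writes the archive to stdout; -i with -t lists the archive contents:

```shell
#!/bin/sh
# Sketch: basic cpio copy-out and listing; names are illustrative.
command -v cpio >/dev/null 2>&1 || { echo "cpio not installed"; exit 0; }

mkdir -p /tmp/cpio_demo && cd /tmp/cpio_demo
echo "hello" > file1.txt
echo "world" > file2.txt

# -o: copy-out mode, filenames from stdin, archive to stdout
ls file1.txt file2.txt | cpio -o > demo.cpio

# -i: copy-in mode, -t: list the table of contents instead of extracting
cpio -it < demo.cpio
```

To actually extract, you would use "cpio -id < demo.cpio" in the target directory
(-d creates leading directories as needed).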