/******************************************************************************/ /* Document : Some simple and Generic Linux commands, to get info quickly. */ /* Version : 39 */ /* File : linux.txt */ /* Purpose : Simple listing of common commands for a quick start in Linux. */ /* Date : 10-10-2012 */ /* Compiled by: Albert van der Sel */ /* Note : This file is especially meant to find info on Linux systems. */ /* */ /* */ /* */ /******************************************************************************/ Hopefully, it can be of use in some circumstances. Contents: 1. Disk & Filesystem Info 2. CPU Info 3. Memory Info 4. Version - Release Info 5. Kernel modules Info 6. Process info 7. How local disks / partitions are named 8. Netcard and Network Info 9. A few notes on how to change your IP parameters 10. Some Monitoring commands 11. The shortest possible "vi" or "vim" survivalkit 12. Some remarks about booting Linux (bare metal) 13. Some remarks about Package Management 14. Some notes on Linux log files 15. A few words on cron, the default scheduler 16. A few words on User Accounts 17. The standard Linux filesystems 18. Some remarks about Linux as a Virtual Machine (VM) 19. A few words on Linux VM's under Xen and XenServer 20. A few words on Linux VM's under VMWare ESX(i) 21. A few words on Volumes and filesystems 22. Some remarks on how to autostart daemons on boottime 23. Some remarks on how to restart daemons on a running system 24. Some SAN and SCSI talk 25. Some notes on Disaster Recovery 26. Recovering the root password 27. A few notes on installing the driver or modules of SAN HBA cards 28. Some special filesystems. 29. A few typical examples of partitioning and creating filesystems 30. A few notes on implementing multipath IO to a SAN ============================================================================ 1. Disk & Filesystem Info: ============================================================================ # Only listings of filesystems/partitions/disks here. -- disk / partition / device info: # blkid # shows information about available block devices # blkid /dev/sda # shows information about sda only # cat /proc/partitions # fdisk -l # fdisk -l /dev/sda # lshw -class disk # lshw -C disk # ls /dev/disk/by-id # sfdisk -l # sfdisk -l /dev/sda # lsscsi # lsblk -f # if available on your system, it shows a # tree of partitions and filesystem types This might work too: # smartctl -i /dev/sda # maybe you need to install the package first # hwinfo --disk # maybe you need to install the package first -- filesystems free/used, and where it's mounted on: # df # show filesystem info # df -k | grep tmp # only show tmp (grep it on tmp) # df -h # human readable output # df -m # in Megabytes # df -h /tmp # only show tmp # df -T # shows filesystem type too (like ext2, ext3, ntfs etc..) # df /dev/hda3 # cat /etc/fstab # list the fstab file to view the standard mounts -- scsi / lun related: # ls -al /sys/class/scsi_host # shows HBA's # ls -al /sys/class/fc_host/ # shows FC HBA's # cat /proc/scsi/scsi # shows devices and LUNs # ls -al /sys/class/scsi_disk # Might show you luns in the form # of paths [host#:bus#:target#:lun#] # lsscsi -c -- Mounts: # mount # cat /etc/fstab -- usb: # lsusb # lsusb -v # verbose output -- swap info: # swapon -s # cat /proc/swaps # cat /proc/meminfo -- list raw partitions: # raw -qa # ls -lR /dev/raw* ============================================================================ 2. 
CPU Info:
============================================================================

# cat /proc/cpuinfo
# dmesg | grep -i cpu

This might work too:

# lscpu
# lshw -class cpu
# lshw -class cpu -short    # limited output
# dmidecode --type 4        # reads DMI table

cpu Usage:

# top
# mpstat
# mpstat -P ALL

============================================================================
3. Memory Info:
============================================================================

# cat /proc/meminfo
# dmesg | grep -i memory
# free
# free -m                   # in MB

This might work too (showing some advanced properties):

# dmidecode --type 17

-> Note: What exactly is that "/proc" stuff? And "sysfs"?

The pseudo or virtual "/proc" filesystem on a running system can be seen as a sort of "window"
to view kernel data structures. Here, subdirectories exist for all running processes, as well as
for system resources, that is, the values of swap, memory, disks, cpu etc..
In most cases, consider it to be "read only". However, in some cases you can use it to send
information to the kernel as well.
Also, whenever you hear of a "virtual filesystem", it means that it's memory based, built when
the system boots, and maintained during runtime.

In a sense, a newer, more structured version of proc is available (since kernel 2.6), which is
called "sysfs". This too is a virtual filesystem, and it sort of exports the "device tree", and
system information, through the use of such a virtual filesystem. You can see it by browsing
through "/sys". You might say that "/proc" is more focused on processes, while "/sys" is a new
way to obtain device- and system information.

============================================================================
4. Version - Release Info
============================================================================

# cat /proc/version
# uname -r
# uname -a
# cat /etc/redhat-release   # Specific for Redhat
# cat /etc/SuSE-release     # Specific for SuSE

This might work too:

# lsb_release -a
# cat /etc/*issue
# cat /etc/*release

-- kernel locations:

The running one is most often loaded from /boot/vmlinuz*
You might find several links here. However, in general, a Linux kernel image might be located
in either / or /boot. Use "uname -a" to show the kernel version.

-- 32 bit or 64 bit system?

# uname -m

============================================================================
5. Kernel modules Info:
============================================================================

# lsmod        # Lists all the currently loaded kernel modules
# rmmod        # Unloads modules, Ex: rmmod ftape
# depmod       # Creates a dependency file, "modules.dep", later used by
               # modprobe to automatically load the relevant modules.
# modprobe     # Used to load a module or set of modules. Loads all
               # modules specified in the file "modules.dep".
# modinfo      # Shows module information

modprobe adds or removes a module from the Linux kernel. You might say that "modprobe"
supersedes commands like the more basic "insmod" and "rmmod" utilities.
Linux maintains the /lib/modules/$(uname -r) directory for modules and their configuration
files (except /etc/modprobe.conf and /etc/modprobe.d).

- modprobe.conf: modprobe checks /etc/modprobe.conf.
- modules.dep  : List of module dependencies, and will be checked too.
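As a hedged illustration (the file name "local.conf", the module names and the option value
below are just examples, not something your system necessarily has), a configuration file under
/etc/modprobe.d/ typically contains simple one-line directives:

# cat /etc/modprobe.d/local.conf
# pass a parameter to the e1000 module whenever it gets loaded:
options e1000 InterruptThrottleRate=3000
# prevent the pcspkr module from being loaded automatically:
blacklist pcspkr
# let the name "eth0" refer to the e1000 driver (older style aliasing):
alias eth0 e1000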
# modprobe -l              # display all available modules
# modprobe -l abc*         # list all abc* modules
# lsmod                    # displays all loaded modules
# modprobe thismodule      # loads the module
# modprobe -r thismodule   # removes the module from the kernel

============================================================================
6. Process info and control:
============================================================================

-- Show processes:

# w                        # w command: list of who is logged on
# w -h                     # without header
# who                      # who is logged on
# users                    # who is logged on
# ps -A                    # show all processes
# ps -ef                   # show all processes. This is the common unix
                           # usage of ps, to show all, including pid, path.
# ps aux | less            # show all processes, one screen at the time
                           # (due to the pipe to less)
# ps -A | grep -i WhatEver # show processes, but filtered on WhatEver
# pgrep WhatEver           # show processes, but filtered on WhatEver
# ps -u john               # show processes of john
# top                      # well-known utility showing processes
                           # and many properties like mem usage, cpu usage
# mpstat                   # top and mpstat can show cpu% usage of pid's
# mpstat -P ALL
# htop                     # like an improved "top", but usually
                           # it needs to be installed.
# pstree                   # show processes in tree format
# pmap -d pid              # shows the memory map of a process (pid)

-- Kill a process

# kill -9 pid              # pid is the process id found with "ps -A"
# killall whatever         # kill a process by its name
# pkill whatever           # kill a process by its name
# xkill                    # a way to kill a graphical x program

-- set priority of running process

# renice 20 123            # set prio of pid 123 to 20

-- start a program in the background, so that the prompt returns to your terminal

# myprg &                  # using "&" places it in the background
# jobs                     # view your running jobs

-- detach a program from your terminal, so that it keeps running

# nohup myprg &            # the "no hangup" nohup command
                           # does the magic

============================================================================
7. How local disks / partitions are named:
============================================================================

See section 1 on how to list disks and partitions. This section is only about device naming.
Here you find information on *local* disk devices. For more info on SAN LUN's, please see
Chapter 27.

=> Entire local harddisks are listed as devices without numbers, such as "/dev/hda" or
"/dev/sda" or "/dev/sga" etc...

The "standard" situation looks like this:

/dev/sda - first SCSI disk (address-wise)
/dev/sdb - second SCSI disk (address-wise)
/dev/hda - master disk on IDE primary controller
/dev/hdb - slave disk on IDE primary controller
/dev/hdc - master disk on secondary controller
/dev/hdd - slave disk on secondary controller

Note:
-----
There are some "deviations" for internal disks, especially with older hardware.
With some distributions, using some specific older disk Array hardware, you might see the
standard disk devices denoted in a different way, like for example:

/dev/cciss/c0d0     Controller 0, disk 0, whole device
/dev/cciss/c0d0p1   Controller 0, disk 0, partition 1
/dev/cciss/c0d0p2   Controller 0, disk 0, partition 2
/dev/cciss/c0d0p3   Controller 0, disk 0, partition 3
/dev/cciss/c1d1     Controller 1, disk 1, whole device
/dev/cciss/c1d1p1   Controller 1, disk 1, partition 1
/dev/cciss/c1d1p2   Controller 1, disk 1, partition 2
/dev/cciss/c1d1p3   Controller 1, disk 1, partition 3

Often, these "cciss devices" are associated with HP Smart Array block drivers. This is locally
attached hardware, where Volumes are presented as the standard disks.
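To see how these names show up on your own machine, you might compare the output of a few of
the listing commands from section 1 (just a sketch; the sd*/hd* names that appear will of
course differ per system):

# cat /proc/partitions                  # the kernel's view, e.g. sda, sda1, sda2 ...
# lsblk                                 # if available: a tree of each disk and its partitions
# ls -l /dev/sd* /dev/hd* 2>/dev/null   # the device files themselves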
=> CDROM / DVD devices:

- To get information on CD/DVD drives use:

# cdrecord -scanbus

It displays information about your CD-R or CD-RW drive. You might see output like:

scsibus0:
0,0,0 0) 'SONY ' 'CD-RW' 'NM56' 'Removable CD-RW'

The first three numbers (for each item) refer to the SCSI bus, device ID, and LUN
(Logical Unit Number), respectively. The fourth number (just before the parenthesis) is simply
the device ID repeated. If you want "to burn" a CD/DVD, those 3 numbers are what the "cdrecord"
command wants to know for the device address.

- Device files:

The device file name of CDROM-like drives is either /dev/cdrom, /dev/sr, /dev/scd, or /dev/dvd.
But, sometimes the drive has a "disklike" devicefile like "/dev/hdb", especially if your system
has only one internal disk, and the CD/DVD is the second device on the IDE primary controller.

So, you might see:

/dev/cdrom             - first CDROM device
/dev/scd0, or /dev/sr0 - first SCSI CDROM/DVD
/dev/hdc               - CDROM/DVD on IDE, or
/dev/hdb               - CDROM/DVD on IDE
/dev/dvd               - DVD

If you have an IDE/ATAPI drive, but SCSI emulation has taken over, the device file changes
from something like /dev/cdrom0, or /dev/hdc, to /dev/scd0.
You can also inspect the output of the following command:

# dmesg | grep '^hd.:'

Note: On systems with an IDE/ATAPI CD-Rom, often scd0 is linked to /dev/cdrom (scsi emulation).

# ln -s /dev/scd0 /dev/cdrom

=> USB:

Often a USB device is recognized as /dev/sdb1.

=> Partitions:

Partitions on a disk are referred to with a number such as:

/dev/hda1
/dev/sda1

So, for example, you could use fdisk to partition /dev/sda as

Device Boot    Start   End    Blocks     Id  System
/dev/sda1      1       255    2048256    83  Linux
/dev/sda2      256     511    2056320    82  Swap
/dev/sda3      512     5721   41849325   83  Linux

=> Naming in GRUB (at the start of Linux boot):

The bootloader GRUB (which replaced the older LILO) uses a naming like:

(hd0,0) : meaning first harddisk, first partition
(hd0,1) : meaning first harddisk, second partition
(hd1,5) : meaning second harddisk, sixth partition

This is a universal naming too. Usually, "/boot/grub/device.map" associates those entries
to device files like /dev/sda

============================================================================
8. Netcard and Network Info:
============================================================================

-- Listing network interfaces and IP parameters:

# lshw -class network
# ifconfig
# lspci                    # list all pci devices
# lspci | grep -i eth      # as above, but now grepped
                           # (filtered) on "eth"
# lspci | egrep -i --color 'wifi|wlan|wireless'
# dmesg | grep eth

You can also check some files (using "cat file_name") to find info on netcard devices.
But it depends a bit on your distribution which files you should inspect. You might try to
take a look in the following directories (or files) (if they exist on your system):

/etc/network                            (directory)
/etc/network/interfaces                 (as a file)
/etc/sysconfig/network                  (as a file)
/etc/sysconfig/network-scripts/ifcfg-   (as files)

Here is an example on RedHat:

[root@linRH507 /etc/sysconfig/network-scripts]# ls -al
total 56
drwxr-xr-x 2 root root 4096 Sep 30 2008 .
drwxr-xr-x 4 root root 4096 Sep 29 2008 ..
-rw-r--r-- 3 root root 164 Sep 30 2008 ifcfg-bond0
-rw-r--r-- 3 root root 143 Sep 30 2008 ifcfg-bond1
-rw-r--r-- 3 root root 172 Sep 30 2008 ifcfg-eth0
-rw-r--r-- 3 root root 172 Sep 30 2008 ifcfg-eth1
-rw-r--r-- 3 root root 172 Sep 30 2008 ifcfg-eth2
-rw-r--r-- 3 root root 172 Sep 30 2008 ifcfg-eth3

[root@linRH507 /etc/sysconfig/network-scripts]# cat ifcfg-bond0
# Bonding of eth0 and eth1 : public interface
DEVICE=bond0
BOOTPROTO=none
IPADDR=10.132.68.11
NETMASK=255.255.254.0
ONBOOT=yes
GATEWAY=10.132.69.254
TYPE=Ethernet

In the example above, I used "cat ifcfg-bond0" to find the IP parameters on a "teamed
interface", called "bond0", which uses eth0 and eth1 together as one "team". That makes no
difference: bond0 just "acts" as one interface.

You can also use some "network stats" commands, which are designed to show you network
traffic/stats info. But often they list the interfaces too. For example:

# netstat -nr
# netstat -i

In many cases, the netcards are listed as the "eth0", "eth1" devices, and others like "lo"
(loopback). Certainly, other device names are possible too. It just depends on your system.

-- Display Ethernet Card Settings (supposing you have the interface "eth0").

# dmesg | grep eth0
# ethtool eth0
# grep eth0 /etc/modules.conf

============================================================================
9. A few notes on how to change your IP parameters:
============================================================================

On most systems, you can edit network configuration files, in order to change network- or IP
related parameters (like hostname, IP address, mask etc..)
Using the commandline (like ifconfig) is another option.

9.1 Editing files:
------------------

=> For example in RedHat, editing config files:

The configurations for each network device you have, are located in the
"/etc/sysconfig/network-scripts/" directory. These configfiles have names like ifcfg-eth0,
ifcfg-eth1 etc..

Here is an example ifcfg-eth0

DEVICE=eth0
BOOTPROTO=none
IPADDR=10.10.10.11
NETMASK=255.255.255.0
ONBOOT=yes
GATEWAY=10.10.10.254
TYPE=Ethernet

With "vi" or "vim" you can edit the file, and change the address and mask, and optionally
other parameters. See Chapter 11 for an extremely short intro on "vim".
Editing the configuration files will make the change permanent. When done, you need to
restart your network services, like so:

# service network restart

=> For example, on Ubuntu:

Check out the "/etc/network/interfaces" file. If you vi that file, you will see records like:

iface eth0 inet static
address 10.10.10.11
netmask 255.255.255.0
network 10.10.10.0
broadcast 10.10.10.255
gateway 10.10.10.254

Make changes as necessary, and save that file. Next, restart your network services, like so:

# /etc/init.d/networking restart

9.2 Using the "ifconfig" command:
---------------------------------

# ifconfig eth0                  # show all parameters of eth0
# ifconfig eth0 down             # stop networking on eth0
# ifconfig eth0 192.168.99.14 netmask 255.255.255.0 up
                                 # configure the IP address and netmask,
                                 # and bring the interface up

In most cases, it's not suited for making "permanent" configurations. Edit the appropriate
config file to make permanent changes.

============================================================================
10. Some Monitoring commands:
============================================================================

On "monitoring" your system(s), you might think of two ways to do so:

- real time monitoring, that is, interactively looking at what processes are running and what
  resources they use, or looking at general disk IO, or memory- and cpu usage.
- performance stats gathering, which lets you view reports based on a certain time period
  (per day, this last week etc..)

Here, we touch on "using commands" which often means that you take (a real time) look at how
your system is performing now (or for some short duration).

Here are some well-known tools. Just listing the names (as I do right now) will not do much
good of course. You SHOULD REALLY try them.

# top              # this shows dynamically all processes and what resources
                   # (like %cpu) they use.
                   # It shows a graphical screen, but in text mode.
# htop             # Many view it as the successor of "top".
# mpstat           # the three on the left primarily focus on cpu usage
# mpstat -P ALL
# sysstat
# iostat           # this tool focuses on disk IO, and shows you several statistics.
                   # basic usage: "iostat interval count",
                   # like "iostat 4 5" which will give statistics data at
                   # 4 seconds intervals, for 5 times.
# vmstat           # this tool focuses on cpu- and virtual memory usage.
                   # Usually it's used with a "delay" option like "vmstat 5"
                   # which shows you statistics every 5 seconds. Also, it's often
                   # used with a "delay" and "count" like "vmstat 5 3" which shows
                   # you stats every 5 seconds, for 3 times. Then vmstat exits.
# sar              # Usually used for reporting statistics over a certain period (like last 24h).
                   # Of course, "something" must be scheduled to collect data,
                   # so that you can use the reporting tool sar to view reports over a period.
                   # Indeed, if you install the sysstat utilities, it can be arranged
                   # that sa1 and sa2 are periodically fired from the scheduler cron,
                   # to build historical data. Then, you can view reports using sar.
                   # Sar is very extensive, and deserves a manual of its own.

Usually you will get some very nice graphical tools too, to be used from an Xwin console.
It also depends a bit on your distribution, but if you have a Workstation with Linux
configured, it's very likely that it boots into a graphical Xwin environment.

Some commands from section 6 can be viewed to fall into the "monitoring" category of commands
as well. For example, "ps -A" shows you all processes with some interesting attributes.
(Try it !). Please see section 6 for those other "monitoring" commands.

============================================================================
11. The shortest possible "vi" or "vim" survivalkit in the Galaxy:
============================================================================

Sometimes you just need to "edit" some (ascii) textfile, like some configuration file, or a
shell script, or whatever ascii file. There are some graphical editors on Linux too, but here
we touch on the traditional textmode "vi" editor or "vim" editor (to be used from a terminal).
You might say that the vim editor is an enhanced version of vi, and it's very likely that vim
is available on your system. Here, we treat vi and vim as being "the same". If you try vim,
just use that name in all examples below.

If you don't like vi or vim, you can choose an alternative editor like "nano" (might even be
better). However, one advantage of vi is that you can use it on any unix system as well
(solaris, aix, hpux etc..).
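If you are not sure which of these editors are actually installed, a quick check from the
shell (just a sketch) might be:

$ which vi vim nano         # prints the full path of each editor that is found
$ vim --version | head -1   # shows the vim version, if vim is present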
--------------------------------------------------------------------
Note: if you only want to view the contents on your screen, you can also simply use
"cat" and "more", like:

$ cat filename | more   # types the content to your screen (while "more"
                        # sees to it that it is not dumped all at once.)
$ more filename         # allows you to "walk" through the text file.

If you are not sure if a certain file is just ascii text, or binary, use the "file" command
first, like so:

$ file filename         # tells you the file type. The output should refer
                        # "in some way" to ascii or text, if it's ascii.
--------------------------------------------------------------------

Now, I hope that you have an innocent, harmless, text file somewhere. Suppose you have found
the file "readme" in some directory. In reality, you may have found some other text file, but
in this example, I will pretend that we use some "readme" textfile.
First, copy the file to the "/tmp" filesystem. Then, switch to "/tmp".

$ cp readme /tmp
$ cd /tmp
$ vi readme             # enter "vi readme" to start vi, and open the textfile.

=> Using the Esc key: you are not in edit mode / you are in command mode

If you press the "Esc" key (at any time) you can "safely" walk through the text using the
arrow keys (you are not in "edit" mode). Just play around with the cursor keys (the arrow
keys). If you indeed have some substantial body of text, try:

Ctrl-D : to go down half a screen. Try that a few times.
Ctrl-U : to go up half a screen. Try that a few times.

Indeed, those two keystrokes are quite handy to move quickly in your document.
Now, just "Play" around a bit, using your arrow keys (cursor keys) and Ctrl-U/D.

=> Entering Insert mode, using the "a" key: now you can add or edit text:

Now, just as a test, place the cursor right in front of the second word, on the second line
(it's just an example). Press the "a" key and you will enter insert mode. Now type the word
"help" (or whatever other word). Press Esc again.

So, alternating "Esc" and "a" means this:

-> If you were in a situation where you had pressed Esc before, and if you THEN press "a"
   (or "i"), you can enter text from the position where your cursor was.
-> If you want to quit entering text, press Esc, and you can again safely walk through the
   file, while not being in the "edit" mode.
-> Deleting some characters or words: Press Esc, navigate to the character or word you need
   to remove, and press the "x" key one or more times to delete characters (or type "dw" to
   delete a word, or "dd" to delete a whole line). Hopefully, you see that the text is getting
   removed. If you press "a", you can add text. Then, press Esc again.

Saving changes, and/or quitting "vi":
------------------------------------

If you press Esc, you leave the "editing" mode, and you go to "command mode".
If you now press ":" (the colon key), and type:

q!  (and press Enter): you quit vi, and you will NOT save any changes (q from "quit")
wq! (and press Enter): you quit vi, but now you WILL save your changes to the document
    (q from "quit" and w from "write")

Ok, this paragraph was ridiculously simple. It's not for nothing that quite some extensive
tutorials exist on vi. On the most basic level, the above pointers should help you out a bit.

============================================================================
12.
Some remarks about booting Linux: ============================================================================ 12.1 Bootsequence in general: ------------------------------ There are some differences between the boot of Linux on bare metal, or if it would boot as a Virtual Machine (VM) on some type of hypervisor like vmware, Xen, Z etc... But, a lot is surprisingly the same. Traditionally, it goes a bit like this: -> BIOS: <- The BIOS will try to find the bootloader. Nowadays, the BIOS can check several devices like CD/DVD, hardisk, netboot etc.. in a certain preferred order. Traditionally, it would load the MBR (Master Boot Record) which is cylinder 0, head 0 and sector 1 of the first harddisk. -> MBR: <- The MBR contains the Partition Table for the disk, and a small amount of executable code and some error messages. This executable code examines the Partition Table, and identifies the System Partition (or Actice Partition). This is the partition that's used to boot the Operating System, and it contains the "Partition Bootsector". -> PARTITION BOOTSECTOR: <- The Partition Bootsector points to some essential loader of the Operating System, like for example "ntldr" of WinNT. From then on, control is passed to ntldr and the bootsequence of NT would start. -> GRUB: <- Once Linux is installed, the above sequence has been changed. This time, The MBR now (usually) contains "GRUB stage 1", which is a bootloader. Once that first stage is loaded, several paths could be followed. However, usually, some additional sectors are read, which also contain "file system drivers". Then, GRUB will load GRUB "stages 1.5 and 2", and a configuration file from "/boot/grub". The exact details will be left out here. -> GRUB AND MULTIBOOT: <- When GRUB is fully loaded, it will present a simple menu of booting to any Operating System, as is listed in "/boot/grub/grub.conf". Since GRUB is so smart, it thus could boot the system to XP, or Win7, if that would be installed too. But usually you would go for Linux. An example of a grub.conf could look like this: # cat /boot/grub/grub.conf # grub.conf generated by anaconda # # Note that you do not have to rerun grub after making changes to this file # NOTICE: You have a /boot partition. This means that # all kernel and initrd paths are relative to /boot/, eg. # root (hd0,1) # kernel /vmlinuz-version ro root=/dev/sda3 # initrd /initrd-version.img #boot=/dev/sda default=0 timeout=33 splashimage=(hd0,0)/grub/dark_sun.xpm.gz hiddenmenu title CentOS 5.2 x86_64 2.6.18-92.1.22.el5 root (hd0,0) kernel /vmlinuz-2.6.18-92.1.22.el5 ro root=LABEL=/ initrd /initrd-2.6.18-92.1.22.el5.img title Windows XP SP3 rootnoverify (hd0,1) chainloader +1 You might notice that (hd0,0) is a way to point to the first partion on the first harddisk. Also, (hd0,1) then points to the second partition on the first harddisk. This is why GRUB is able to let you choose to start to Linux or a system like Windows XP or Win7. Example of a grub.conf with just one option: # cat /boot/grub/grub.conf # grub.conf generated by anaconda # # Note that you do not have to rerun grub after making changes to this file # NOTICE: You have a /boot partition. This means that # all kernel and initrd paths are relative to /boot/, eg. 
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/myvg/rootvol
# initrd /initrd-version.img
#boot=/dev/cciss/c0d0
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Server (2.6.18-308.13.1.el5)
root (hd0,0)
kernel /vmlinuz-2.6.18-308.13.1.el5 ro root=/dev/mapper/myvg/rootvol
initrd /initrd-2.6.18-308.13.1.el5.img

About grub's naming of disks/partitions: It's something like an OS-neutral naming convention
for referring to disk devices. Here, hard disks are always called hdN (N=0,1,..), floppy disks
are fdN, and partitions are referred to as (hdN,M) (N,M in 0,1,..).

-> INITRD (or INITRAMFS on some systems), AND KERNEL LOAD: <-

The initial RAM disk ("initrd" or "initramfs") phase provides for an initial root file system
that is mounted prior to when the "real root file system" can be mounted. The initrd is bound
to the kernel and loaded as part of the kernel boot procedure. Thanks to this intermediate
mount, modules can be loaded into the kernel and as a result, the kernel is able to make the
"real file systems" available and get access to the real root file system.

-> INIT AND RUNLEVELS: <-

The kernel will execute the "init" process. The init process starts all other processes.
The /etc/inittab file contains instructions for init. It contains directions for init on what
programs and scripts to run when entering a specific runlevel.
As of init, there might be slight variations between the different Linux distributions on how
scripts will be executed and from which locations.

A (partial) inittab file might look a bit like this:

# Default runlevel. The runlevels used by RHS are:
# 0 - halt (Do NOT set initdefault to this)
# 1 - Single user mode
# 2 - Multiuser, without NFS (The same as 3, if you do not have networking)
# 3 - Full multiuser mode
# 4 - unused
# 5 - X11
# 6 - reboot (Do NOT set initdefault to this)
#
id:3:initdefault:

# System initialization.
si::sysinit:/etc/rc.d/rc.sysinit

l0:0:wait:/etc/rc.d/rc 0
l1:1:wait:/etc/rc.d/rc 1
l2:2:wait:/etc/rc.d/rc 2
l3:3:wait:/etc/rc.d/rc 3
l4:4:wait:/etc/rc.d/rc 4
l5:5:wait:/etc/rc.d/rc 5
l6:6:wait:/etc/rc.d/rc 6

# Trap CTRL-ALT-DELETE
ca::ctrlaltdel:/sbin/shutdown -t3 -r now

So, suppose in inittab, the default "runlevel" is set at "3" (id:3:initdefault), the scripts
as specified by "/etc/rc.d/rc 3" will be executed. On some systems, it may mean that all
scripts (or symlinks to /etc/init.d) in "/etc/rc3.d" (or "/etc/rc.d/rc3.d"), with a name that
starts with "S" (as from Start), will be executed. So, it might be possible to find, say, an
"S99Oracle" script, that boots Oracle.
Note however, that there might be some variations on how exactly the "rc" scripts are found
between the different distributions. See also Chapter 22.

Note:
-----
Before GRUB, a common bootloader was "lilo" (LILO). That bootloader was somewhat more limited
in capabilities. It used "/etc/lilo.conf" as its configuration file.

12.2 Creating a Boot CD/DVD:
----------------------------

Here, just a few general remarks are presented.

Option 1: using a downloaded .iso file, to create a CD/DVD:
-----------------------------------------------------------

Here, the general outline would be:
- First you find, or download, the .iso file.
- Then, you burn that .iso file to CD/DVD, using "cdrecord" or a graphical burning tool.
- Done.

1. Download a suitable "filename.iso" file, from the internet, or other location.

2.
Once you have downloaded the right .iso file, optionally check it using: # md5sum filename.iso # check if the downloaded file has the same # "checksum" as you saw on the site 3. Next, you need to have CD/DVD burning software, or use the "cdrecord" command. You must burn the .iso file to the writable CD, or DVD "as an image". An .iso needs to be "burned" in a 'specific way" that expands/extracts the image, so that you end up with usable files on your disc. The burnprocess will create a bootable media automatically. Thus, the "bootable info" on the resulting bootable DVD, is just part of the .iso file. Typically, here a graphical burning tool in Xwin would be ideal. Ofcourse, using the commandline is possible too. Here, most often the "cdrecord" command is used. This command expands the .iso file to CDR/DVD. Hopefully, your OS has support for the drive, without needing to install anything. If the command "cdrecord -scanbus" shows a drive (or more drives), you are good. Using cdrecord: --------------- If you would go into a commandline session with "cdrecord", it would esemble something like this: # cdrecord -scanbus # in order to find the dev address (if needed) # See also Chapter 7. # cdrecord -v speed= dev= /path_to_iso like: # cdrecord -v speed=8 dev=0,0,0 /isofiles/example.iso # cdrecord -v dev=0,4,0 example.iso As you can see, once you have a .iso file, it is not too hard to create a bootable CD/DVD containing the extraction of that .iso file. Or, you simply burn it from Windows, using a graphical utility. Option 2: Create an .iso file, then burn it to CD/DVD: ------------------------------------------------------ Here, the general outline would be: - First you create the .iso file yourself, using "mkisofs". - Then, you burn the .iso file to CD/DVD, for example using graphical burning software or the "cdrecord"command we saw above. - Done. This procedure is just a tiny bit harder to perform. First, "mkisofs" has a "not so easy" to comprehend commandline syntax, using quite a few parameters. There are many ways to proceed further, also depending on your specific distribution. As 'handy' information for reading articles on this subject, we must realize that when booting from CD-ROM, there are couple of different "modes", among which exists: - "SYSLINUX like", where the boot information from a "bootable floppy" is stored in an image file on the CD. So, if you boot from that CD/DVD, that image is then loaded from the CD. It behaves like it was a "virtual floppy". It's also called "Floppy emulation mode". - "ISOLINUX like", which is "no emulation mode". The boot information is stored directly on the CD, and no floppy image is needed. Later isolinux makes its possible to store harddisk MBR bootinfo on the CD/DVD media. - EXTLINUX like", which is a general-purpose bootloader, like GRUB. Nowadays, it's integrated with isolinux. So, now it's best to find a good link describing the procedure for your distribution. If your purpose is to have a good Disaster Recovery procedure for Linux on Bare Metal (using a physical machine, instead of a using a virtualized environment), you might want to see what tools like "Mondorescue" can do for you. See also Chapter 25. Note: if for example using VMWare ESX(i), it's really easy to copy the systemstate of a Linux VM to another place, since it's only a .vmdk file. ============================================================================ 13. 
Some remarks about Package Management: ============================================================================ This is about software management. Proper Package management allows you to install, update, and delete software, and to query the present state of packages installed. Software and applications on Linux systems are usually organized in the form of "packages" that contain and describes all the relevant parts of an application (for example, binaries, configuration files, and libraries). This software needs to be installed in the correct way, at the right locations, and garding all dependencies. Indeed, that is the main responsibility of a packager. However, a few large software vendors, still use their own "installer". - Originally from RedHat, the Redhat Package Manager (RPM) is found, or can be used, on many Linux environments. You might say that RPM is a bit of a standard. - Another popular package manager, is "YUM" (Yellowdog Updater Modified). It's more like a frond-end to RPM. - Yast (yast2) is found on SuSE, but is also RPM compatible. - There are quite a few others like apt-get, dpkg etc.. Some rpm examples: ------------------ # rpm -ivh packagename # installing the package, using -i # rpm -Uvh packagename # Upgrading a package, using -U, # is similar to installing one, # but RPM automatically un-installs existing # versions of the package before # installing the new one # rpm -e packagename # remove (erase) a package, using -e # rpm -qa # querying on all packages. It uses the # "/var/lib/rpm" repository/database # rpm -q packagename # querying on a specific package # rpm -qa | grep whatever # querying all, but filtering on 'whatever' # rpm -qf /usr/bin/whateverfile # querying to what package a certain file # belongs, using -f # rpm -qlp whatever.rpm # showing all files in the package # rpm -Va # validates all packages Some yum examples: ------------------ # yum install package # installing # yum -y install package # like above, but without prompting for confirmation # yum remove package # removing the package # yum update package # updating the package # yum search mypackage # searching for a package As you can see, it's not too hard working with some package management tool. However, some operations might take some time, while it seems that you do not get response back in a timely fashion. It's probably wise "to give it some time", because in a few cases, if you interrupt the process, it might leave the repository in an indeterminate state. ============================================================================ 14. A few words on Linux logs: ============================================================================ Some main Linux logfiles, and how to view them. Remark 1: --------- The "syslog" subsystem / "syslogd" daemon, logs all sorts of system messages, from informational- to critical messages. The "/etc/rsyslog.conf" configurationfile determines (partly) what events are going to be logged and where. Remark 2: --------- Some of the logfiles maybe under the control of logrotate, meaning that multiple files might be present using similar names (but ending in .1, .2 etc.., or other extension, and possibly compressed too). See "/etc/logrotate.conf", or "/etc/logrotate.d/syslog" (or other file) for configuration of "logrotate". 
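As a hedged illustration (the log file name and the values below are just examples, not
necessarily what your distribution ships), an entry in "/etc/logrotate.d/" typically looks
like this:

/var/log/messages {
    # rotate once a week, and keep 4 old (compressed) copies
    weekly
    rotate 4
    compress
    # don't complain if the file is missing, and skip it if it is empty
    missingok
    notifempty
}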
Usually, there exists a daily cron job, like for example scheduled like so: 05 6 * * * /usr/sbin/logrotate /etc/logrotate.conf Here, logrotate runs one a day at 06:05, taking its settings from /etc/logrotate.conf => The "/var/log/messages" file is one important one. It contains global (ongoing) system messages. # cat /var/log/messages | more # view the messages logfile # more /var/log/messages # view the messages logfile # tail -f /var/log/messages # View New Log Entries as they are # happening in real-time (using tail -f). # Use Ctrl-C to stop viewing => The "/var/log/dmesg" file contains (among other stuff) messages the system issued at boottime, and some kernel output. # dmesg # view the dmesg logfile # cat /var/log/dmesg | more # view the dmesg logfile => The "/var/log/lastlog" file, "/var/log/faillog" file, and "/var/log/auth.log" file. "/var/log/lastlog" - contains the recent login information for all the users. Use the "lastlog" command to view this log. # lastlog "/var/log/faillog" - contains user failed login attemps. Use the faillog command to view this log. # faillog "/var/log/auth.log" - contains logged auth events, like logins, the use of sudo, remote connections etc.. # grep sshd /var/log/auth.log | less # Here you only are interested # in sshd events, thats why # you grep on that string. # cat /var/log/auth.log | more # View all entries => "/var/log/daemon.log" - contains messages and events from system and application daemons. Note that there usually are multiple files like /var/log/daemon.log.1 and compressed ones after ".1". Just use one of our familiar commands to view the uncompressed files (like "cat", or the "more" command) => The "/var/log/kern.log" (or kernel or kernel.log) - contains kernel messages. ============================================================================ 15. A few words on cron: ============================================================================ Cron is the default scheduler in Unix and Linux. It uses the socalled "crontab" files which define which jobs are called, and the schedules of those jobs. You can think of all types of jobs, like backup jobs, certain print jobs, whatever.. Often, shell scripts are scheduled, but true programs can be scheduled too. Usually, root and some other admin accounts have a number of scheduled tasks. Here you might think of backup jobs, and all sorts of "housekeeping" tasks, like archiving old logfiles etc.. But any account, if authorized, can have it's own crontab. => Using cron For example, suppose you use the account "oracle", and you have logged on to the system, then: - If you want to see your schedule tasks, use the command "crontab -l": # crontab -l # view the scheduled tasks - If you want to edit the schedules, or add/remove jobs, then use "crontab -e": # crontab -e # edit your scheduled tasks When you edit your crontab file, using "crontab -e", vi (or another editor) will start and you are able to alter date/times of schedules, add or remove jobs etc.. If you just list, or edit your crontab file, then typically you will see records like the following example: minute hour day_of_month month weekday command 15 4 * * * /home/harry/bin/maintenance.sh From "left" to "right", the first 5 field simply define the schedule of the command. You see the following fields: -minute (from 0 to 59) -hour (from 0 to 23) -day of month (from 1 to 31) -month (from 1 to 12) -day of week (from 0 to 6) (0=Sunday) A "*" means "all" or "indifferent". 
So in the example above, the shell script "/home/harry/bin/maintenance.sh" is scheduled to
run once, every day, at 04:15 h.

minute hour day_of_month month weekday command
15     4    *            *     1       /root/archive.sh

In the example above, the script "/root/archive.sh" only runs on Monday (day of week=1,
that's Monday), at 04:15h.
Be careful not to use "* * * * *", because that would mean that a job is scheduled to run
every minute, every hour, every day (unless you really meant it to be that way).

See, using cron is really easy. However, you need to be a bit handy with your editor
(usually "vi") in order to add/remove or modify records.

Here is another example:

minute hour day_of_month month weekday command
30     18   7            *     *       /root/maint.sh >> /log/maint.log 2>&1

Here, the script "/root/maint.sh" only executes once, on the 7th day of each month, at 18:30h.
Notice the ">> /log/maint.log 2>&1". It means that standard error (2) is redirected along with
standard output (1), in this case both appended to the /log/maint.log logfile.

=> Starting and stopping cron:

On many distro's, root can use:

# /etc/init.d/crond start   # start crond daemon
# /etc/init.d/crond stop    # stop crond daemon

Usually, there is no need for this since crond will start at boottime, and only in some rare
cases do you need to restart crond.
In some environments, the following can be used to start/stop cron:

# service crond start
# service crond stop

=> Allow a login to use cron:

If you are root, and an account needs to use the scheduler, then add the name to the
cron.allow file. Usually, there exist the "/etc/cron.allow" and "/etc/cron.deny" files.
On some other systems, take a look in /var/spool/cron to locate the cron.allow file.

============================================================================
16. A few words on User Accounts:
============================================================================

On any distribution, graphical tools are available for creating users and groups if you want
to use Xwin environments. Usually, authorisations are granted to groups, where each member of
such a group will inherit these permissions.
Anyway, from the cli, you can use the "adduser" or "useradd" commands in order to create a new
user. You can use "usermod" to alter user properties. The exact mechanics will vary somewhat
between distributions. A few examples:

=> "useradd" and "adduser" to create new (local) users:

# useradd harry             # adds the account with defaults
# useradd -s /bin/bash -m -d /home/harry -c "King Harry" -g root harry

Usually:
-s : Login shell for the user.
-m : Create the user's home directory if it does not exist.
-d : Home directory of the user.
-g : Group name or number of the user.
UserName : Login id of the user.

The "adduser" command will interact with you, so that the system will ask you to provide
values.

# adduser harry
.. informational messages + questions asked...

=> "usermod" to modify an existing account:

# usermod -d /home2/albert albert   # alter homedir
# usermod -e 2012-10-20 albert      # disable account as of 2012-10-20

Many more options are available. When a user is created, it will be known to the system by
its UID. Although you are also able to change a user's UID with "usermod", you should be VERY
reluctant to do so, since all objects that the user created in the past are known to the
system by its UID (user ID) and its GID (group ID).

=> "addgroup" and "groupadd" to add Groups:

Similar to the adduser and useradd commands, these can be used to create Groups.
# groupadd dba => Showing your UID or that of another user: # id # id harry => the "/etc/passwd" and "/etc/group" files Local Users are described in the "/etc/passwd" file. Local Groups are described in the "/etc/group" file. The /etc/passwd is an ascii file, containg a list of the accounts, giving for each account information like the user ID, group ID, home directory, which shell the account uses. Passwords are not stored in /etc/passwd, but traditionally in "/etc/shadow". You can always view the contents of "/etc/passwd", or search for a string: # cat /etc/passwd # cat /etc/passwd | grep -i Albert # grep Albert /etc/passwd Ofcourse, Linux can function as an LDAP Server for central authentication and management of accounts, as well as that it can be integrated in an existing Directory Service. ============================================================================ 17. The Linux standard filesystems: ============================================================================ 17.1 Logical layout of the root "/" filesystem: ----------------------------------------------- When a Linux distribution is installed, some standard filesystems and mountpoints will be created. One of them is the "root" / fileystem. When you take a look at it, using some graphical utility, or using just the "ls" or"ll"command, logically it looks like this. Fig. 1. |-/bin (user binaries) |-/sbin (system binaries) |-/etc (configuration files, rc scripts) |-/dev (device files) |-/var (variable stuff, like logs. Sometimes apps are installed there too) / -|-/usr (user programs) |-/proc (process information, system information) |-/boot (files needed at boot) |-/home (user home directories) |-/root (home dir of root) |-/tmp (place for temp files) |-/lib (system libraries) |-/opt (optional location for programs) Most directories contain a lot of subdirectories as well, so actually it's a whole "tree". A "path" to a certain file, say the file "messages" in "/var/log/", starts from the root "/", then the directory "var", then we need to go to "log". 17.2 Is it a directory in the "/" filesystem, or a seperate filesystem?: ------------------------------------------------------------------------ Note that you cannot know beforehand (by browsing the tree), if "/home" is just simply a directory in "/" (root), or a seperate filesystem mounted on "/home". The same is true, for example for "/var". Well, the question is simply solved if you just use the "mount" command, or the "df -T" command, or simply take a look at the "/etc/fstab" file. All three methods will cleary show you the different filesystems, and where it's mounted on. For users, it's not an interresting question. For admins it is. So, what's the difference anyway? An explanation now follows. If you already know this stuff, it might be boring, and you might skip to the next Chapter. If you want to hear it: read on. Case 1: ------- Let's consider a traditional bare metal install of Linux on a PC. Suppose you have installed a second IDE harddisk, known to Linux as /dev/hdb. If you partition it using fdisk, # fdisk /dev/hdb Then fdisk asks you a couple of questions, and you end up with a partition "/dev/hdb1". Then you create, a filesystem on that partition, in order to make it usable: # mkfs.ext3 -b 4096 /dev/hdb1 Please note that any distribution has it's own tools, so in your case maybe another command must be used. Anyway "mkfs" or "crfs" will allmost always work. Ok, now we have a formatted filesystem (of type "ext3"). It's nearly usable. But not yet. 
Now we make it "alive" by "mounting" it to a "mountpoint".
Suppose I create a directory "/data" (which will be my new mountpoint).
Now let's mount the filesystem: I edit the "/etc/fstab" file (which registers all mounts)
and place a new record in it, like:

/dev/hdb1    /data    ext3    defaults    1 1

(there are more records in fstab, similar to this one).
Now I can simply mount the filesystem, by using:

# mount /data

And it's available. So what happened here? /data looks like a simple directory, which it is,
but it's actually also a mountpoint for a completely separate filesystem.

Case 2:
-------

I know that the following is ridiculously simple, but take a look at this scenario:
I could create a directory "oracle" in "/", so now we have "/oracle". In /oracle, I could
place all sorts of subdirectories and files. I could do something similar in "/data" of
case 1. So what's the difference?

"/data" corresponds to a whole separate filesystem (in this case, on a second harddisk).
"/oracle" is just a directory, within the "/" filesystem.

There is a difference here. You agree?

So, what about /usr, /bin etc..? Are those all separate filesystems on separate partitions,
or are they simply directories within "/"?
The answer is: "/" root corresponds to a filesystem, and with many distributions, almost all
of the directories listed in Figure 1 above, are indeed directories, and not mountpoints
(on which "separate" filesystems are mounted).
Usually, only a few "directories", like "/home", are associated with separate filesystems.
However, with some installations, you might see that /var, /usr, /opt, /home, (and possibly
others), are separate filesystems, using these mountpoints.

Actually, there is nothing really special about it. Most unixes do it "their own way".
For example, in AIX, Solaris and others, "/usr", "/var" etc.. sit in their own filesystems,
each associated with its own partition.

17.3 The "/etc/fstab" file - repository of filesystems and mountpoints:
-----------------------------------------------------------------------

In Linux, the standard mounts are defined in the "/etc/fstab" file. If you have created a new
filesystem, and you want Linux to mount it at boottime, you need to enter a record in that
file. Here are 3 examples:

Example 1:

LABEL=/          /              ext3    defaults              1 1
LABEL=/boot      /boot          ext3    defaults              1 2
devpts           /dev/pts       devpts  gid=5,mode=620        0 0
tmpfs            /dev/shm       tmpfs   defaults              0 0
proc             /proc          proc    defaults              0 0
sysfs            /sys           sysfs   defaults              0 0
LABEL=SWAP-sda3  swap           swap    defaults              0 0

Example 2:

/dev/hda2        /              ext2    defaults              1 1
/dev/hdb1        /home          ext2    defaults              1 2
/dev/cdrom       /media/cdrom   auto    ro,noauto,user,exec   0 0
/dev/fd0         /media/floppy  auto    rw,noauto,user,sync   0 0
proc             /proc          proc    defaults              0 0
/dev/hda1        swap           swap    pri=42                0 0

Example 3:

# tmpfs          /tmp           tmpfs   nodev,nosuid          0 0
/dev/sda1        /              ext4    defaults,noatime      0 1
/dev/sda2        none           swap    defaults              0 0
/dev/sda3        /home          ext4    defaults,noatime      0 2

Note that you usually do not see separate filesystems for /usr, /var etc..
But, usually "/", "/boot", "/home", and swap do "have" their own partitions.

Note:
Here is an example of the standard filesystems in AIX. AIX does not use an "/etc/fstab" file,
but the mounts and filesystems are registered in "/etc/filesystems" which has a similar
function. Note that if we take a look at hdisk0 of rootvg, we can see that /usr and /var have
their own partitions/filesystems (Logical Volumes), and are not simply part of "/". They are
separate filesystems, "simply" mounted on /usr and /var.
# lspv -p hdisk0 hdisk0: PP RANGE STATE REGION LV NAME TYPE MOUNT POINT 1-1 used outer edge hd5 boot N/A 2-48 free outer edge 49-51 used outer edge hd9var jfs /var 52-52 used outer edge hd2 jfs /usr 53-108 used outer edge hd6 paging N/A 109-116 used outer middle hd6 paging N/A 117-215 used outer middel hd2 jfs /usr 216-216 used center hd8 jfslog N/A 217-217 used center hd4 jfs / 218-222 used center hd2 jfs /usr 223-320 used center hd4 jfs / .. ============================================================================ 18. Some remarks about using Linux as a Virtual Machine: ============================================================================ There are quite a few options on running Linux as a Virtual Machine (VM). If we would think in a "practical way", you might devide those options in "Desktop" solutions and "more Business-like" solutions. - An example of a Desktop solution might be a XP or Win7 PC (or Win Server) where a product like "VMWare Player" is installed (or VMWare Server), which makes it possible to run a Linux VM, within Windows. - More business-like solution would for example be an ESXi Host (VMWare) where possibly a large number of Windows- and Linux VM's are running (with more options like centralized Administration of ESX Hosts, and "Live Migration" of a VM to another ESXi Host). Often, these products can be considered to use a real "hypervisor". Some would say that the distinction would be better described with: - A "hosted" environment where a Host OS (Windows or Linux) can "host" one or a few Guest OS'ses. - A true specialized hypervisor kernel on a physical machine (like ESXi or Xen), which supports running many VM's. ---------------------------------------------------------------- For a general description of Virtualization: see the note below. ---------------------------------------------------------------- When you consider a Desktop solution, like a Windows Host OS, it might add a lot of value to your PC: . For example, if you have a Win7 PC or so, and you just want to try or test some Linux distro, you can run such a "Guest OS" without (relevant) modifications to your Host OS. . As another example, maybe your standard Desktop is Windows, but suppose you must write a lot of linux shell scripts, then loading a Linux VM might well be a solution. I agree that it sounds more "right" to run a "Windows VM" from Linux, but a fact is that many folks do it the other way around. -> A few examples of "Desktop Virtualization" Products (for Windows) which can run a Linux VM: 1. VMWare Server : You can run one, or a few Linux VM's from Windows or Linux. However, support has ended. 2. VMWare Workstation : Might be viewed as the succcessor of VMWare Server. 3. VMWare Player : Simple way to run one or a few VM's. However, some advance options are not present, like connecting to vSphere, upload VM's, snapshots etc.. -> A few examples of "Business Virtualization" Products which can run Linux VM's: 4. VMWare ESX(i) : One of the most implemented solutions in business infrastructures. 5. Xen Server : Another popular solution in business infrastructures. 6. Red Hat RHEV : Yet another popular solution in business infrastructures. 7. IBM Power and Z : IBM Z mainframes, and Power Systems (like system p), can run many Linux VM's (lpars). This is list is far from complete ! And please also realize that it's only focused on what virtualization products support Linux VM's. There are many more virtualization platforms, like hpux npars, solaris containers etc.. etc.. 
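As a side note: if you are logged on to a Linux system and wonder whether it actually is a VM
(and on which platform it runs), the commands below might give a hint. This is only a sketch:
the exact strings in the output differ per hypervisor, and "dmidecode" usually requires root.

# dmidecode -s system-product-name   # may show something like "VMware Virtual Platform"
# lscpu | grep -i hypervisor         # newer lscpu versions print a "Hypervisor vendor" line
# dmesg | grep -i hypervisor         # the kernel may have logged "Hypervisor detected: ..."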
When you consider the popular VMWare products, among other files, typically you would see these files: - .vmx file : This one stores configurations with respect to the VM. - .vmdk file(s): This is a virtual disk file, which stores the contents of the virtual machine's hard disk drive. There are other files as well, like a .lck (lock file) and .log files and others. For some Linux distributions, you can download a vmx file and vmdk file from the internet, and instantly create a VM in a Test/Play configuration, using "VMWare Player" or "VMWare Workstation". Note: Types of Virtualization: ------------------------------ To describe the virtualization techniques in general, most people use the following classification to describe the business solutions: => Full Virtualization: An unmodified Operating system, like Linux, can be installed in a VM. The OS is "hypervisor unaware". In short: it "thinks" that it runs on a true physical machine. The hypervisor needs to intercept driver actions and all calls to hardware. Also, the hypervisor emulates the whole bunch (like the BIOS, all hardware). Often a process called "binary translation" takes place by the hypervisor. In general it might be percieved that there is a lot of overhead at the hypervisor level. => Para virtualization. A modified OS (with respect to kernel and drivers), like Linux, can be installed in a VM. The OS now is "hypervisor aware" and makes use of the API of the hypervisor. In short: the guest OS "knows" that it's running in a VM. The overhead on the hypervisor is less when compared to (traditional) full virtualization. - For Linux VM's, the socalled paravirt-ops code ( pv-ops) was included in the Linux kernel as of the 2.6.23 version. - Nowadays, for Windows VM's, Windows either need the socalled "enlightened" drivers, or the "Virtualization Vendor" wants you to install it's own paravirtualized drivers in Windows (after Windows was installed), or Windows runs under full virtualization (HVM, see below). => Hardware assisted virtualization (often called "HVM"): Instead of binary translation of calls of the VM, like in use with the traditional "full virtualization", the hypervisor (or VMM) now hands off the effort to hardware capable to simulate a complete hardware environment. The OS may be unmodified, just like with traditional full virtualization. This type of virtualization is often considered to be "a special case" of full virtualization. Hardware-assisted virtualization requires explicit support in the host CPU, and in the case of Intel, the socalled "VT" (VT-d, VT-x) capabilities needs to be present. Often Hardware assisted virtualization is percieved as the fastest virtualization technology, however many tests seem to point to the fact that the gains are not spectecualr over software based emulation (for now). ============================================================================ 19. A few words on Linux VM's under Xen and XenServer : ============================================================================ This section only serves to provide for a very, very, light-weight impression of Xen/XenServer. - Xen: Xen is the fundamental (and original) hypervisor. It main purpose is to support VM's and to provide for isolation between them. Xen is open source (GPLv2) and is managed by Xen.org. Later, Citrix maintained their own "evolution", called XenServer. - Citrix XenServer: Citrix XenServer comes as a free edition, and as a commercial one with support. XenServer still includes "Xen", the fundamental hypervisor. 
A Xen host will run VMs, which are called "Domains". The first one that boots, is called "Dom0" (Domain 0") and enables you to control the other VM's. Any other VM is unprivileged, and are known as a "domU" or "guest". Dom0 is a Linux machine, to which you can logon using the "console" or using a remote connection. Many Linux distributions are suited for Dom0 like Debian, Fedora, OpenSuse. From Dom0, you can use specific commands to create, start, stop, list VM's, as well as other commands. Grapical tools are available as well. The other VM's usually runs Linux distributions or Windows Server editions. There were some efforts done to split "Dom0" into Dom0 and "driver domain" which are is unprivileged Xen domains that has been given responsibility to a specific piece of hardware. We will not discuss that further in this simple note. Traditionally, the Xen or XenServer architecture resembles the following figure: Fig. 2. --------- ------------------------------------------ |Console| | |Dom0 | |DomU | |DomU | | | |---- |---|Linux | | | | | | --------- | | | |Guest OS| |Guest OS| | | |XAPI | |VM1 | |VM2 | | | |Xend |----| | | | | | | | | | | | | | |-------| |--------| |--------| | | |drivers| |Virt Drv| |Virt Drv| | |----------------------------------------- | XEN hypervisor | ------------------------------------------ | HARDWARE | ------------------------------------------ From Dom0, the commandline can be used to control the other Domains. Formerly, from Xen, the "xm" commandset was available. However, as of XenServer on, you should use the "xe" commandset. Usually, Volume groups are created from phyical volumes (disks). Once the Volume Groups (VGs) exists, Logical Volumes (LVs) are created, which will form the filesystems for the Virtual Machines. => Some examples of (older) the xm commandset or "Xen commandline user interface" # xm [OPTIONS] Where a few examples of the "subcommand" is listed below (like shutdown), and to specify on which domain the command is in effect, the "domain-id", or the "domain name" should be used. # xm list # list all VM's (Domains) on this Host # xm console domain-id # connect to a DomU # xm create testdom [-c] # creates a DomU (with per default file in /etc/xen) # xm create Fedora4 # create a VM # xm shutdown Fedora4 # shutdown a VM # xm suspend 166 # suspends the domain with id 166 # xm resume 166 # resumes the domain with id 166 # xm destroy Suse5 # remove a VM The list command for example can show you a list similar to: Name ID Mem(MiB) VCPUs State Time(s) Domain-0 0 98 1 r----- 5068.6 Fedora3 164 128 1 r----- 3.2 Fedora4 165 128 1 ------ 0.6 Fedora5 166 128 1 -b---- 3.6 Suse10 168 100 1 ------ 1.8 => Some examples of (newer) the xe commandset or "XenServer commandline user interface" The syntax of XenServer xe CLI commands is: # xe command-name argument=value argument=value ... A few examples: # xe vm-list # list all VM's (Domains) on this Host # xe vm-start vm=name # start this VM # xe vm-destroy uuid=UUID # remove the VM ============================================================================ 20. A few words on Linux VM's under VMWare : ============================================================================ This section only serves to provide for a very, very, light-weight impression of VMWare ESX/ESXi and Infrastructure. 
20.1 Birds-Eye view on Host OS'ses, and "Management" infrastructures:
---------------------------------------------------------------------

=> Host Operating system: VMWare ESX or ESXi

ESX(i) refers to the Host Operating System, which is the hypervisor and all supporting
software on a physical machine, that makes it possible to run VM's on this Host.
Multiple ESX(i) Hosts can form a Cluster.
There are a "few" differences between the ESX and ESXi Operating systems, which will be
addressed below.

=> Management frameworks for ESX and ESXi

ESX:
In a larger ESX environment, which may have many ESX Hosts, you will often find that the
"VirtualCenter" Management Server is implemented, which is a Win2K3 Server with a SQL Server
based repository. Windows AD authentication is possible, which determines your permissions
in the infrastructure.
VMWare Admins may use a Windows based "VI client" to connect to the VirtualCenter and
create VM's, stop/start VM's, change resources and the like.
The Admins can also ssh to any ESX Host, and use a command line interface to manage VM's.

ESXi:
Quite similar to the above, but the Management framework is improved and renamed to "vSphere",
using the "vCenter Management Server".
VMWare Admins may use a Windows based "vSphere client" to connect to the vCenter Management
Server and create VM's, stop/start VM's, change resources and the like.
The Admins can also ssh to any ESXi Host, and use a command line interface to manage VM's.

20.2 ESX and ESXi:
------------------

- ESX

ESX (on a bare metal Host) actually uses a slightly modified Linux OS (for boot), which is
called the "Service Console" or "Console Operating System (COS)".
This console uses the Linux kernel, device drivers, and other usual software like init.d
and shells. This Service Console (a full Linux machine) can be seen as the primary environment
for one ESX Host. You can for example just ssh to it, and (for example) also find the usual
user accounts such as root and other typical Linux accounts.

During boot of the Service Console (Linux), the VMWare "vmkernel" starts from "initrd" and it
starts "to live" for itself, while Linux boots further on.
The vmkernel with all supporting modules can be seen as the "Hypervisor" which is used for
"virtualization". It's important to note that vmkernel is not "just" a Linux kernel module.
After a full boot sequence, the state of affairs is that the vmkernel may be viewed as *the*
kernel, while the (Linux based) Service Console may be seen as the first Virtual Machine on
that ESX Host. The Service Console then can be viewed as the "management environment" for
that ESX Host.
It's indeed obvious that a VMWare Admin can ssh to the Service Console, and perform CLI
management of that ESX Host.

- ESXi

ESXi can be viewed as the successor of ESX. ESXi is a much smaller, and faster loading, system.
First of all, it does not use a Linux boot. So the (Linux) "Service Console" does not exist
anymore. Many people view ESXi as almost just firmware, and the change from ESX is indeed
quite large.
ESXi is essentially a vmkernel microkernel with loaded modules for supporting services, and
there is no binding with a "Console Operating System" or "Service Console" as was the case
with ESX.
This "lighter" (smaller footprint) environment is considered to be faster and more secure
(smaller attack surface).
It's still possible for a VMWare Admin to ssh to the VMkernel environment, and perform CLI
management of that ESXi Host (a few example commands are shown below).
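Just to get a feel for such CLI management on an ESXi Host, a few example commands are shown
here. This is only a sketch: the available commands differ per ESX(i) version (see also
section 20.3), and the vmid "42" is just a fictitious example.

# vim-cmd vmsvc/getallvms            # list the registered VM's, with their vmid's
# vim-cmd vmsvc/power.getstate 42    # show the power state of the VM with vmid 42
# vim-cmd vmsvc/power.on 42          # power on that VM
# esxcli system version get          # (ESXi 5.x) show the ESXi version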
Release: Year: Management Frmwrk VMware ESX Server 1.x 2001 VMware ESX Server 2.x 2003 VMware ESX Server 3.x 2006 VirtualCenter VMware ESX Server 4.x 2009 vSphere VMware ESXi 3.5 2008 vSphere VMware ESXi 4.0 2009 vSphere VMware ESXi 5.0 2011 vSphere The following dummy figure might help in understanding the ESX architecture. Fig. 3. ----------------------------------------- | ESX Host | -- ssh connection | | to Linux Console | ------ ------ | | agents | VM | | VM | | -- Agents communicate | ---------------------- | | | | | with VirtualCenter | |Linux | ------ ------ | or vCentre | |System Console | | | | |------------------ | |-kernel | VMKERNEL | | |-syslog | vmklinux | | |-accounts like root | | | |-some typical | | | | filesystems like | | | | /etc, /var/log |--------- | | |-drivers | | | |--------------------------------------| | CPU | | HARDWARE Memory | |----------------------------------------| 20.3 Command Line interfaces: ----------------------------- It can be quite confusing to understand and have a good overview of all CL interfaces as they emerged (and depreciated) with each new ESX or ESXi version. Here are a few examples A few typical cli's for an ESX v3 environment are: vmkfstools : used mainly for disk/ file system management vmware-cmd : used mainly for VM operations esxcfg-* : used mainly used to configure ESX (or use the graphical VI client or vSphere client) A few typical cli's for an ESX v4/v5 environment are: vim-cmd : used mainly for VM operations esxcli : used mainly used to configure ESXi If using a modern vSphere infrastructure, a VMWare Admin might also use "Powershell" for management of ESXi and Virtual Machines. Presently, CmdLets are available for that purpose. ============================================================================ 21. A few words on LVM, SAN, and filesystems: ============================================================================ 21.1 Get current info of diskdevices: ------------------------------------- From section 4, we have seen some commands to retrieve disk information from the system. Here a few commands are listed again: # fdisk -l # or use it like "fdisk -l /dev/sda" # lshw -class disk # sfdisk -l # cat /proc/scsi/scsi # lsscsi -c # has some interesting switches like # -l (long), -d (show magic numbers) # ls /sys/class/scsi_host # you might see output like # host0 host1 host2 host3 # lsblk -f # if available on your system, # it shows a tree of partitions # and filesystem types Especially "lsscsi" might be usefull, since it shows information of devices as ATA, SCSI, Fibre channel (FC), iSCSI and the like. (you might need to install it first) 21.2 Filesystems: ----------------- Once a new local disk is reckognized by the system, or what's more likely, a LUN on a SAN was made available, you then need to create a "filesystem" on that device before you can "mount" that filesystem and make it ready for use. You know that a "filesystem" is associated with any sort of storage device, like a physical disk. When you create a filesystem on a disk, the OS will organize it in allocation units, create specific areas for metadata (that is: sort of "bookkeeping" data structures), it does some sort of integrety check, and ultimately, makes it available for storage of files. There are many types of filesytems, where most have similar properties. But the newer ones have often much more extended support (for example for storing very large files, and supporting large partitions). 
Although all filesystems are quite similar in their basic functionality, some filesystems
are better at some specific characteristic.

- For example, a certain filesystem might be better at "journaling" (a sort of logging)
  than others.
- Or, for storing certain databases, you might have a preference for some filesystem
  over others.
- As another example, there are also filesystems which are better equipped for "multi-node"
  access (clustering) than just the "normal" filesystems.

So, in most Operating Systems, you have a lot of choice.
Using the mount command, or df -T, you can quickly view which filesystem types are mounted
on your system.

# df -T
# mount

The most popular filesystems found in Linux are:

ext2, ext3 and ext4
XFS
JFS
ReiserFS

And in certain cases, you might want to use special filesystems from certain Vendors, like:

OCFS      - Oracle Clustered Filesystem, might be preferred when using Oracle clusters
OCFS2
GPFS      - General Parallel File System from IBM, to be used in clusters.
VxFS      - from Veritas
GFS       - from RedHat
FAT,NTFS  - Microsoft

21.3 Create Filesystems:
------------------------

=> Local disk:

So, in general, suppose you get a new local disk. Then what?
Suppose you have "/dev/sda" as a new local SCSI disk. Then you would follow an approach like:

# fdisk /dev/sda             # use fdisk to create partitions
                             # like sda1, sda2

Then create a filesystem on, for example, sda1.
If you want ext2, and mke2fs is available, you could do this:

# mke2fs /dev/sda1 2048256

or like so:

# mkfs -t ext2 -b 4096 /dev/sda1

or, if you want ext3, you could do this:

# fdisk /dev/sda
# mkfs.ext3 /dev/sda1

Then you mount the new filesystem on a suitable mount point.

# mount /dev/sda1 /data

=> LUN from a SAN:

So, in general, suppose you want to use a new LUN from a SAN. Then what?
Suppose it's an FC SAN. You'll need to install an FC card in your Linux box and load the
appropriate driver. Then you'll need to configure the SAN gear to export LUNs, that is:
create the LUN, do the zoning and mapping (usually done by a Storage Admin).
Once that's done, your Linux box should see the LUNs as Linux /dev/sdXY devices
(just like a SCSI disk) that you can make filesystems on and mount as usual.

21.4 Block and character devices:
---------------------------------

If you take a look in the "/dev" directory, and make a listing of the special files over there,
you might notice the "b" or "c" in front of the filemode or permissions, like "brw-rw-rw-"
or "crw-rw-rw-".
One thing to understand is that these files are NOT the drivers for the devices.
They are more like "pointers" to where the driver code can be found in the kernel.

For example:

brw-rw-rw-  2 bin  bin  2,64  Dec 8 20:41  fd0

Here the "magic numbers" 2,64 tell you at which address the driver for "fd0" can be accessed.

But what does the "b" or "c" tell us then? It shows us whether the device is a "block" device
or a "character" device.
A character device (file) is something that just gives a stream of characters that you read
from or write to. A block device (file) is something that uses whole blocks for reading and
writing (via the cache), and that's why it is used for disks.

21.5 Detecting new disk devices, and bus-scanning:
--------------------------------------------------

If, for example, a new LUN is made available, how do you make Linux detect the device?
A reboot usually scans all busses, but that's often not an option.
Listing disks by using for example "fdisk -l" often does not help.
The following might help:

(1):

On some distributions, you might find a script that will try to scan for new devices.
For example, on RedHat, the following script exists:

rescan-scsi-bus.sh

On your specific system, a similar script might exist too.
In general however, there usually are some limitations in using such scripts, so you need
to check the documentation of your system.

(2):

To initiate a SCSI bus rescan, type either:

echo "1" > /sys/class/fc_host/hostX/issue_lip

where X stands for the (FC) host you want to scan. The command is essentially a bus reset,
so do not use it on a busy system.

or use:

echo "- - -" > /sys/class/scsi_host/hostX/scan

where again, X stands for the host (adapter) you want to scan. The command is essentially
a full rescan of devices on that host.

Depending on driver version, and kernel version, you might just need one command,
or maybe even both.
Usually, the scan commands are followed by a setup utility from the SAN manufacturer,
like for example "powermt config".

In many older situations, the /proc filesystem was used to communicate with the kernel
and drivers, like in for example:

echo "scsi-qlascan" > /proc/scsi/qla2xxx/X

where X again is the host number.

Whichever is appropriate for your specific system is of course a bit hard to tell.
I am afraid you need to spend some time investigating.

============================================================================
22. Some remarks on how to autostart daemons on boottime:
============================================================================

When considering the question of how you can enable or disable the start of daemons (services)
on Linux, a whole bunch of answers exists, which is also dependent on your particular
distribution.

Two "classical" solutions are:

1. Use the rc scripts:
----------------------

In general, the following action should do the trick:
To start a program automatically, go to the directory "/etc/init.d" and create a script that
starts the program. Then make a link in "/etc/rc.d/rc3.d" that points to that script.

So, suppose that you want, say, "apache" to start at boot time.
Suppose you created a script "httpd" in "/etc/rc.d/init.d".
Usually, such a script can take one of two parameters, namely "start" and "stop".
Now, create a symbolic link in the /etc/rc.d/rc3.d folder, with a name like for example
"S80httpd", that is linked to /etc/rc.d/init.d/httpd

#cd /etc/rc.d/rc3.d
#ln -s ../init.d/httpd S80httpd

Of course keep in mind that scripts need to be executable, so set the filemode as appropriate.

Note that I created a link in "/etc/rc.d/rc3.d", because the default runlevel often is "3".
But some distributions use a default of 2, and some even treat 2-5 as being the same.

2. Use the "rc.local" file:
---------------------------

Most Linux distributions use the "rc.local" file, whose content will be executed at the end
of the system initialization. This file is a bit like the "last minute" multi-user startup file.
In "rc.local" you can put the desired startup commands.
Usually, rc.local can be found in "/etc", or otherwise try "/etc/rc.d". If you don't have it,
you might try:

# touch /etc/rc.local
# chmod 700 /etc/rc.local

Then put the command lines to start daemons (or services) in /etc/rc.local.
For example, you could place the following record in rc.local:

/etc/rc.d/init.d/httpd start

The "rc.sysinit" file:
----------------------

You should not use the "rc.sysinit" file.
It's more oriented for setting system parameters and settings, like network parameters, possibly swap, and many more... When init awakes it will parse the following: init -> reads the inittab (or init) file -> runs /etc/rc.d/rc.sysinit -> runs the rest of /etc/inittab -> inittab contains default runlevel: init runs all processes for that runlevel /etc/rc.d/rcN.d/ , -> runs /etc/rc.d/rc.local Usually, from inittab, init executes /etc/rc.d/rc.sysinit in new subshell. ============================================================================ 23. Some remarks on how to restart daemons on a running system: ============================================================================ In order see the status of all scripts and services in your system, use # chkconfig # chkconfig --list # service --status-all or to see the status of just one: # chkconfig --list # chkconfig | grep # ps -ef | grep -i # service --status-all | grep If you need to stop, or start, or restart a service or daemon on Linux, the following examples might help. Examples: => Red Hat and friends: # service nfs restart # service nfs start # service nfs stop Or, this works too: # service network restart # command similar to above #/etc/init.d/network restart # equivalent => On many other Linux distros: # /etc/init.d/nfs start # /etc/init.d/nfs stop # /etc/init.d/nfs restart # /etc/init.d/networking restart # /etc/init.d/networking start # /etc/init.d/networking stop ============================================================================ 24. Some SAN and SCSI talk: ============================================================================ 24.1 Some SCSI terminology, and how it's called in Linux: --------------------------------------------------------- It does not matter too much whether scsi commands are "encapsulated" in frames like for example used with an FC infastructure (using switches/directors), or with FCoE or iSCSI SAN, compared to the traditional local SCSI adapter: much terminology is the same. Lets take a look at a "traditional" locally installed SCSI adapter (HBA). - An adapter might have one or more "channels" or SCSI busses. So, in the case of multiple channels, each channel is it's own individual SCSI bus. See figure below. - Each SCSI bus can have multiple SCSI devices connected to it. - In narrow SCSI, we can have up to 8 SCSI devices, each identified by it's unique SCSI ID (0-7), where traditionally, the HBA takes ID 7. - In SCSI wide, we can have up to 16 SCSI devices, again each identified by it's unique SCSI ID (0-15). To illustrate this a bit, see the figure below. ---------------- scsi id7 ADAPTER or | channel/bus CONTROLLER ||----------------------------|--------------------|---------- | --------- --------- | [scsi id 2] [scsi id 3] | --------- --------- | |--lun0 |--lun0 | channel/bus |--lun1 |--lun1 ||------------- |--lun3 | ---------------- Now, suppose we have a SCSI device on the bus, for example with SCSI ID 2, which happens to be a CD Tower, using multiple CD Drives. Luckely, subadressing exists, so that the individual drives of this tower can be accessed. The devices, which reside under a certain SCSI ID, are called "Logical Units", identified by their "Logical Unit Number" or "LUN". Now, in this example we used a CD Tower, but it also can be some diskarray. To get to some LUN, the following "path" or full adress must be used. The list below shows the standard SCSI talk, and Linux talk. 
SCSI:                   Linux terminology:

SCSI adapter number     [host]
channel number          [bus]
scsi id number          [target]
lun                     [lun]

=> So, in SCSI language, a "path" to a LUN would be:

scsi_adapter, channel, scsi id, lun

=> Using the naming conventions of Linux, this becomes:

host, bus, target, lun

Later, when we use commands like "lsscsi", or take a look in "/proc/scsi/scsi", we will see
output like [4:0:1:0], which is exactly the same as [host,bus,target,lun], or in SCSI terms,
[scsi adapter#, channel#, scsi id, lun#].

Some remarks about "Initiator" and "Target":

Suppose we have a PC using a SCSI card, where on one bus some SCSI disks are present.
Now, suppose some application does a system call for some "file open". The OS will handle that,
and ultimately, a driver takes care of the details.
Anyway, the SCSI card gets a request from the driver. When we then consider the processes on
the SCSI bus, the SCSI card acts as a controller called an "initiator". This one usually starts
"the conversation" using SCSI commands.
The "target" then, is one of the storage devices (like a SCSI disk) on the bus.

When communication is performed between controllers and targets (and thus involving disks),
typically the elements in transfer are "block address spaces" and "datablocks".
That's why people often talk about "block I/O services" when discussing SANs.

24.2 LUNs on SANs:
------------------

A so-called traditional LUN on a SAN might come into being like this:
The Storage Admin selects a couple of disks and creates a RAID volume from those disks.
At this point, a "Logical Unit" might be created.
Then, the managing software for that SAN will associate a "number" with that "Logical Unit",
called a LUN. Please note that the "Logical Unit" might be seen as usable diskspace, once a
"client" is able to "see" it. The LUN identifies this LU in the storage system.

After some additional actions and zoning, the LUN in principle can be "seen" from the
authorized client/initiator on the channel.
However, on the client side, often some re-enumeration has to be done in order to see the
new LUN as a disk.

- Sometimes it's a facility of the OS, which can be called at any time. People then often say
  that "scanning the busses" needs to be performed.
- Sometimes a special signal needs to be sent to some module in the driver software stack.
- Unfortunately, in some rare cases, only a reboot will help, since in some cases, only then
  will NVRAM or BIOS routines rescan the busses.

24.3 Initiatives to (try to) unify driver stacks to access SANs:
----------------------------------------------------------------

Multiple SAN Vendors exist. Although the protocols and interfaces are quite well defined,
the supporting software (like drivers) for Linux is another matter.
The risk exists that the Linux community would face a jungle of drivers, ways to implement
software, and numerous Vendor-tied issues.
Initiatives were taken to try to structure the ways to set up and maintain the needed software.
Two main initiatives are:

=> STGT/TGT: Linux SCSI target framework

TGT tries to simplify various SCSI target driver creation and maintenance, using iSCSI,
Fibre Channel, SRP, and others. The framework encompasses kernel-space and user-space code.
The idea is that newer Linux kernels (as of 2.6.20) would/should be equipped with the
supporting kernel code, so that only userspace code needs to be installed.

=> LIO Target

LIO Target is another multiprotocol SCSI target for Linux.
This one too supports all modern protocols like iSCSI, Fibre Channel, InfiniBand (SRP),
and a few other architectures. The same idea as with TGT applies here too.
LIO went "upstream" into Linux with kernel version 2.6.38, and has become the standard
unified block storage target in Linux.

So, it seems that the Linux Community has favored the LIO framework a bit, thereby not
excluding any other target framework of course. However, TGT went upstream as of 2.6.20.
So, users can choose which framework serves them best for a certain situation.
However, it's not all "whiskey and sunshine" here. Some distributions seem to have their
own worries. Anyway, there are packages available for LIO, STGT and others like IET.

============================================================================
25. Some notes on Backup/Restore:
============================================================================

A few notes on Disaster Recovery (DR) for Linux. In this context, we mean a proper way
to recover the OS.

25.1 "Simple" backups are easy, but a DR solution is not:
---------------------------------------------------------

Of course, a number of "archiving" or backup tools are available as standard in Linux,
like tar, cpio, dd and a number of others.
So, using these tools, you can "backup" a file, or a number of files, or a directory,
or a whole directory tree, to another location.

So, suppose you have a lot of stuff in, say, "/apps"; it is possible to create a tar file
(containing the whole of /apps) on a backup location, for example some NFS mount, or tape,
or just another filesystem on your machine.

Example:

# tar -cvf /backups/backup_apps.tar /apps

Here, we create the tar (backup) file "backup_apps.tar" on the nfs mount "/backups",
and it contains the whole of "/apps".

Or, a bit smarter, to compress the backup file as well:

# tar czf /backups/backup_apps.tar.gz /apps

Note: If active database files are alive in some subdirectory of /apps, then they will not
be backed up in a usable state. Creating backups using the standard tools (tar, cpio etc..)
expects static or "cold" files.

With respect to using the standard tools like tar, cpio, etc.., they will NOT enable you to
create a proper "Disaster Recovery" solution for recovery of your whole OS environment.
However, if you are familiar with "dd", you can.
Although you can backup directories, or whole filesystems (and raw partitions), as a part of
disaster recovery you need to backup the root filesystem as well, and have a way to boot from
media, so you need "something" that holds your MBR and other boot areas as well.
Also, even large third-party suites like TSM and many others will help in creating good,
up-to-date backups of filesystems, but usually will not be of help in creating a product
that is immediately usable when the boot disk is corrupt.

But these really are "bare metal" problems. If you run VM's under Xen or VMWare,
DR solutions are around the corner.

25.2 Using VM's under a Virtualization Product (like ESX): DR is relatively easy:
---------------------------------------------------------------------------------

If you run Linux (or Windows) Virtual Machines under ESX, it's relatively easy to backup such
a VM (the whole system). In an ESX Infrastructure, you can create a "snapshot" of a VM, which
means you get the system disk in a file, which you can easily import again in case the "live"
system goes bad.
Also, since a VM is basically a .vmdk file in some datastore, it is easy to copy it to a
backup location.
Here you have all the needed filesystems like "/" and "/boot", just stored "magically" in that
vmdk file, while the ESX Host environment takes care of the conditions under which the VM
will boot.
So, usually a good DR compliant solution is available under virtualization.

25.3 Using Bare Metal:
----------------------

On a standard Linux distribution on a physical machine (bare metal), you can create backups
of files, directories, and filesystems, but you cannot easily create a "single component"
DR solution with the standard tools alone.
Usually, you can easily backup data directories (like /apps), but making a good image of the
Operating System is often not simple.
Here we mean: you have a physical machine. Then, your boot disk goes bad. Now, you want to
apply a "one component" solution, by which we mean restoring MBR, grub, filesystems etc..
In short: the whole lot.

There exist commercial and Open Source tools which can help you out.
In the Open Source realm, you might take a look at the features of:

- Mondorescue
- Rear: Relax and Recover

Both are able to backup a complete bootable system to local media like CD-R, DVD and the like,
or to a network mount from NFS.
Often, the commandline operation of those tools is quite complicated, but I think it's worth
the effort if your IT operations use important bare metal machines.
The charm of these tools is that you can create usable OS images on NFS mounts. So if you have
tens of bare metal machines, these tools are certainly of interest.

If you are only interested in backing up the OS of your PC, or a few machines, to CD-R/DVD
or USB stick, and to be able to boot from these media, much simpler solutions exist.
Thousands of good articles can be found on the Internet.

25.4 Document your Servers:
---------------------------

Whether you use a virtualized environment or not, having technical documentation about each
server is very important.
For your important servers, why don't you save the output of some important commands to a
.txt file, like for example:

cat /proc/scsi/scsi         >> /home/admin/serverdoc.txt
ls -al /sys/class/scsi_host >> /home/admin/serverdoc.txt
df -h                       >> /home/admin/serverdoc.txt
cat /etc/fstab              >> /home/admin/serverdoc.txt
cat /etc/exports            >> /home/admin/serverdoc.txt
raw -qa                     >> /home/admin/serverdoc.txt
fdisk -l                    >> /home/admin/serverdoc.txt
cat /proc/partitions        >> /home/admin/serverdoc.txt
cat /etc/inittab            >> /home/admin/serverdoc.txt
ls -al /boot                >> /home/admin/serverdoc.txt
cat /boot/grub/grub.conf    >> /home/admin/serverdoc.txt
cat /etc/*release           >> /home/admin/serverdoc.txt
uname -a                    >> /home/admin/serverdoc.txt
cat /etc/group              >> /home/admin/serverdoc.txt
cat /etc/passwd             >> /home/admin/serverdoc.txt
ifconfig                    >> /home/admin/serverdoc.txt
cat /etc/resolv.conf        >> /home/admin/serverdoc.txt
crontab -l                  >> /home/admin/serverdoc.txt
lsmod                       >> /home/admin/serverdoc.txt
etc..

I am sure you can "improve" this example easily!
This way, you have important config info of a server in just one simple txt file.
Then you could also script and schedule it. Then, from all your servers, you can collect those
files into some repository. This then means you have pretty good technical documentation of
all your Linux machines.

============================================================================
26. Recovering the root password:
============================================================================

Suppose you don't know, or forgot, the root password of your Linux installation.
Maybe the following will help:
1. Trivial option: maybe by using sudo?
---------------------------------------

If sudo is implemented, and your account is authorized, you may try:

$ sudo su -

Sudo then asks for YOUR password. Next, you enter root status, where you can change the
root password.

# passwd

OK, it's very unlikely that this works, but it's worth a try if you knew that sudo was
implemented.

2. Booting to single user mode:
-------------------------------

As you might recall from Chapter 11, the "inittab" file defines the possible "runlevels"
the system can enter at boot time.
Usually, runlevel 3 is the default, meaning regular multi-user access. Runlevel 1 is the
so-called "single user" mode, especially meant for specific maintenance purposes, where user
access is not desired. "Single user mode" is sometimes also denoted by "emergency mode".
When the system boots to single user mode, it should boot to the "#" prompt, from where you
can change the password.

On some systems, Grub will present "single user mode" as one of its menu options.
On other systems, you need to perform some actions when Grub has presented its boot menu.
Usually, it goes like this. However, there is no guarantee that it works for your specific
system.

When Grub presents its menu:

1. go to the entry you want to modify (the Linux option)
2. use the "e" key to edit this entry
3. find the kernel line
4. at the end, add: single   or   single init=/bin/bash  (see note)
5. use Esc to go back
6. press "b" to boot to single user mode
7. after a short time, hopefully a root shell appears.
8. Here you can use the passwd command

Note:
If the shell wants you to log in, you might try to append "single init=/bin/bash" on the
kernel line at step 4, instead of just "single".

3. Using a Rescue or Live CD/DVD:
---------------------------------

Boot up from CD/DVD. Suppose that your harddisk based Linux has the root filesystem on
/dev/sda2. Suppose that the booted live DVD has a tmpfs on /tmp.

1. Mount the (harddisk) root partition in a directory, for example:

# cd /tmp
# mkdir mnt
# mount /dev/sda2 /tmp/mnt

2. Bind the current /dev with the would-be root:

# mount --bind /dev /tmp/mnt/dev     # either 2 or 1 "-"

3. chroot into the harddisk root filesystem:

# chroot /tmp/mnt /bin/bash

Now you can use:

# passwd root

And reboot the system, this time from harddisk.

============================================================================
27. A few notes on checking/installing driver or modules of SAN HBA cards:
============================================================================

* IMPORTANT:                                                               *
* Here you will find some general information only.                       *
* It does not contain exact instructions on how to connect your system to *
* SAN LUNs and how to configure them in a correct way.                    *

In the former chapters, we have seen some commands to see which kernel modules are present
(lsmod, modprobe), and what hardware is installed (lspci, lshw, lsscsi).
Of course you can use those commands to check out which SAN FC HBA card(s) are in your system,
which kernel modules are loaded, and what device info can be found in "sysfs" (/sys)
and "procfs" (/proc).

Note that for access to SANs, three main techniques are used:

- (Traditional) Fiber Channel infrastructure, where fiber is used, and switches/directors
  connect hosts to SAN storage.
- Fiber channel over Ethernet (FCoE), where FC is encapsulated in network packets.
  In effect, a network is used.
- iSCSI, which is essentially SCSI over a true IP network infrastructure
  (see the short iSCSI example right after this list).
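Purely as an illustrative sketch of the iSCSI case (the package names and the target IP address
192.168.1.50 are just examples, and details differ per distribution): with the open-iscsi
initiator tools, discovering and logging in to an iSCSI target might look like this.

# yum install iscsi-initiator-utils                      # or: apt-get install open-iscsi
# iscsiadm -m discovery -t sendtargets -p 192.168.1.50   # discover targets on that portal
# iscsiadm -m node --login                               # log in to the discovered target(s)
# cat /proc/scsi/scsi                                    # the LUNs should now show up as
                                                         # ordinary scsi disks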
Most of what you will see implemented are the traditional FC fiber infrastructures and
iSCSI networks.
There is a difference between FCoE and iSCSI, although in both cases a network is used.
With FCoE, the true FC protocol stack architecture is carried by the Layer 2 network (Ethernet).
With iSCSI, just SCSI commands are encapsulated in an IP network.

With a (Traditional) Fiber Channel infrastructure, each HBA has a unique "World Wide Name"
(WWN), which is similar to an Ethernet MAC address, so that this card can be uniquely
identified. WWNs are 8-byte strings. There are two types of WWNs on a HBA: a node WWN (WWNN),
which can be shared by all ports of a device, and a port WWN (WWPN), which is necessary to
uniquely identify each port.

27.1 Check to see if a Fibre FC card and driver is installed:
-------------------------------------------------------------

--> Check for kernel modules and hardware

# lsmod | grep -i lpfc       # Check if the Emulex adapter driver module is loaded
# lsmod | grep -i qla        # or Qlogic
# lsmod | grep -i scsi       # or just see what SCSI related modules are there

The adapter-type is often either qlaxxxx for QLogic adapters, or lpfc for Emulex adapters.
But many others are used as well.

# lspci | grep -i fibre      # grepping on keyword "fibre"
# lspci | grep -i emulex     # since Qlogic/Emulex is often used,
# lspci | grep -i qla        # you might try that as well

# dmesg | grep -i fibre      # recall that dmesg (among other things)
# dmesg | grep -i emulex     # can be used to view kernel boot messages;
                             # also, after boot, diagnostic messages can be viewed

--> using /proc (procfs):

# ls -al /proc/scsi/
# cat /proc/scsi/scsi        # To display the SCSI devices currently attached
                             # (and recognized) by the SCSI subsystem
# ls -al /proc/scsi/lpfc

--> using /sys (sysfs)

Recall from chapter 24, how the hierarchy of hosts, busses, targets and luns is used.

SCSI:                   Linux terminology:

SCSI adapter number     [host]
channel number          [bus]
scsi id number          [target]
lun                     [lun]

# ls -al /sys/class/scsi_disk    Might show you luns in the form of paths [host#:bus#:target#:lun#]
# ls -al /sys/class/fc_host/     Might show you dirs like host0, host1 etc..
# ls -al /sys/class/scsi_host    Will show you hostN's too (N=0,1 etc..)

=> Why is that? Why two trees?

An FC port can be based on the "true" physical port. But the newer FC protocols also allow for
so-called virtual ports. Using the so-called "N_Port ID Virtualization" (NPIV) mechanism, a
point-to-point connection to a "Fabric" can be assigned more than 1 N_Port_ID.
So, usually, the driver will create a new scsi_host instance on the vport, resulting in a
unique namespace for the vport.
The result is that in all cases, whether an FC port is based on a physical port OR on a virtual
port, each will appear as a unique "scsi_host" with its own target and lun space.

# ls -al /sys/class/fc_transport  Might show you dirs like target1:0:0, where 1 is the host,
                                  0 is the bus, and 0 is the target id.

Often, the WWN can be found using:

(1):

# cat /sys/class/scsi_host/host1/device/fc_host:host1/port_name

Here, in this example, we used "host1".

(2):

Or look in the "/proc/scsi/adapter_type/n" directory, where adapter_type is the host adapter
type, and n is the host adapter number for your card.

--> Finding the WWID of a disk device, for example of "sdc":

# scsi_id -g -u -s /block/sdc

Note:
-----

-> For 2.4 kernels, you can find the driver version in the "/proc/scsi/lpfc/n" directory,
   where n is the host adapter port that is represented as a SCSI host bus in Linux.
-> For 2.6 kernels and higher, because of the move to sysfs, the driver version might not be
   available in the /proc/scsi/lpfc/n directory. If so, go to the /proc/scsi/lpfc directory
   and inspect the values. Use "cat /sys/class/scsi_host/hostn/lpfc_drvr_version", where n is
   each of the values recorded from the /proc/scsi/lpfc directory.

27.2 Installing a Fibre FC card:
--------------------------------

The following is of course not an installation manual for FC drivers, but it serves to give us
an idea of a typical setup.

Install the adapter card. If the card is not connected directly to the storage unit, or to a
fabric switch port, install a "loop-back connector". This "loop-back connector" might be
supplied with the adapter card. Next, reboot the server.

Now, maybe you need to install the driver software... or maybe not!
Recall from section 24.3, that uniform driver stacks might already be present on your system.
However, maybe you do indeed still need to install a driver kit.

Say that we are installing an Emulex lpfc compatible driver. A typical session might go
like this:

- Download the driver kit from the Emulex Web site, or copy it to the system from the
  installation CD.
- Unpack the tarball with the following command:

  tar xzf lpfc_2.6_driver_kit-.tar.gz

- Change to the directory that is extracted:

  cd lpfc_2.6_driver_kit-

- Execute the 'lpfc-install' script (with no switches) to install the new driver kit. Use:

  # ./lpfc-install

Of course, every type of HBA will have its own installation method. So the above is just an
example. Such a script might compile .ko modules, put the stuff in the right directories, and
then load the driver using "modprobe".

Now, if the SAN "exports" LUN's on a channel connected to your HBA, you should be able to see
the devices. However, often a "re-scan" of the bus is necessary. See also section 21.5.

27.3 The mapping of device names to scsi devices:
-------------------------------------------------

So, we have the scsi disk device files like "/dev/sdb", and we can see stuff in
"/proc/scsi/scsi", but how do those relate to each other?
I mean, if you use "fdisk" or a similar program, you think in terms of "/dev/sdb", and so does
the kernel. So, there has to be some sort of "mapping" between what the kernel thinks, and what
the "scsi subsystem" thinks is the state of affairs.

=> If we use a command like this:

# lsscsi -l

we might see records like:

[4:0:1:0]  disk  IBM  2145  0000  /dev/sdb
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
..
..

=> If we now look in /proc/scsi/scsi, we see:

# cat /proc/scsi/scsi
..
Host: scsi4 Channel: 00 Id: 01 Lun: 00
  Vendor: IBM  Model: 2145  Rev: 0000
..

Then we see more or less the same info. From the "lsscsi" command alone, we can see that the
path to the LUN, [4:0:1:0], actually corresponds to "/dev/sdb".
The SCSI subsystem likes to think in fully qualified addresses, like [4:0:1:0], which actually
defines a "path" (so to speak) to a LUN, so that it can easily address that LUN.

Alright, we have been able to relate the usual disk device names to the identifiers we can see
in /proc/scsi and in several places in /sys.
Note that there is no relationship between SCSI devices and partitions. So, your system may get
a LUN from a SAN, like [a:b:c:d], which gets a device name like sdXY.

27.4 Modifying the initial ram-disk ("initrd" or "initramfs"):
--------------------------------------------------------------

In Chapter 12, we have spent a few words on the Linux boot process.
At a certain stage, "initrd" exposes a mini filesystem so that the kernel can obtain all needed
modules and mount all "real" filesystems.

If you have a Linux driver that does not automatically configure any LUNs other than LUN 0,
then we need to let the system detect the LUNs automatically when the system is started, by
modifying the initial ram-disk (initrd). If only LUN 0's are detected, we need to perform the
steps described later in this section.

Note: Other reasons for initrd (or initramfs):
We cannot expect the kernel to know about all the possible hardware and disk devices in the
world. So, if filesystems are on specific disk devices, it needs to load the necessary kernel
modules first. That's why, in general:

- Either you use a kernel with pre-enabled support for all devices connected to your system.
  So, this is "compiled-in" support for a certain hardware driver.
- Or you use an initrd preliminary root file system image, where the kernel can find what it
  needs. Here the kernel uses loadable modules for supporting all devices.

So, if you want the kernel to use specific hardware (e.g. a SCSI HBA), you can either "put"
(compile) the driver into the kernel, or you make it easy for the kernel to find the
appropriate modules.

So, if not all SCSI devices are available at boot, and we want the kernel to find everything
using an initrd ram disk:

1. Configure the SCSI mid-layer:

Usually, we should start with instructions for the SCSI mid-layer driver (that controls how
many LUNs are scanned during a SCSI bus scan), telling it to scan for more.

- Open the /etc/modules.conf file.
- For Linux 2.4 kernels, add the following line:

  options scsi_mod max_scsi_luns=n

- For Linux 2.6 kernels, add the following line:

  options scsi_mod max_luns=n

n is the total number of LUNs to probe, like for example 64, 128.

2. Rebuild the ram-disk for the current configuration:

You probably do not need this if you build your scsi drivers right into the kernel, instead of
into modules. Otherwise, we need to update the ram disk.
To rebuild the ram-disk associated with the current kernel, use the appropriate command for
your specific operating system. For example, on RedHat:

# cd /boot
# mkinitrd -v initrd-kernel.img kernel

where "kernel" is the string you see using "uname -r".

So, on RedHat and similar systems, we can use the "mkinitrd" command. On other distributions,
similar commands are available, like "mkinitramfs", which uses a slightly different approach
(using a cpio archive).

>> IMPORTANT ! <<
If you want to use mkinitrd, you need more information, and you should try it on a test
system first (!!!)

============================================================================
28. Some special filesystems:
============================================================================

28.1 The "tmpfs" filesystem, and "/dev/shm":
--------------------------------------------

The "tmpfs" filesystem is actually backed by "virtual memory", and often you can see a mount
on "/dev/shm" of type tmpfs. However, on your particular system, do not be very surprised if
you do not see "/dev/shm" as a mountpoint.
Since it's memory based, it is cleared after a reboot.
It was designed for programs to communicate using shared memory, and for increasing performance
when programs store temporary files in this filesystem.
Some folks say that the "tmpfs" filesystem type, when mounted on a certain mountpoint, is just
like a "ram disk".

There can be some confusion if you compare it to the familiar "/tmp" mount.
Usually, "/tmp" is diskbased, but sometimes it's memory based as well For "/tmp", there can be several implementations: - it can be of type "tmpfs" as well, so it's memory based. - it can be just a directory within "/", so it's disk based. - it can be a seperate partition, mounted on "/tmp", so it's disk based as well. Use "df -h" and take a look at fstab ("cat /etc/fstab") to find out how it is implemented on your system. So, in general, - there exists a filesystem "type" tmpfs (memory based). - you might have a mountpoint "/dev/shm", which is memory based. - you have a /tmp mountpoint, which might be diskbased, or memory based. Really, it's not "vaque" or something. Just keep in mind that not all distributions and versions take the same approach. 28.2 The /proc and /sys pseudo filesystems: ------------------------------------------- The pseudo or virtual "/proc" filesystem on a running system, can be seen as a sort of "window" to view kernel data structures. Here, subdirectories exists for all running processes, as well as for system resources, that is, the values of swap, memory, disks, cpu etc.. In most cases, consider it to be as "read only". However, in some cases you can use it to send information to the kernel as well. Also, whenever you hear of a "virtual filesystem", it means that it's memory based, build when the system boots, and maintained during runtime. In a sense, a newer, more structured version of proc is available (since kernel 2.6), which is called "sysfs". This too is a virtual filesystem, and it sort of exports the "device tree", and system information, through the use of such a virtual filesystem. You can see it by browsing through "/sys". You might say that "/proc" is more focussed on processes, while "/sys" is a new way to obtain device- and system information. 28.3 The "/dev/mapper" device mapper : -------------------------------------- In general, your system might have disk access through: - Directly Attached Storage (DAS), which could be some internal disk, or even a disk array, directly attached on a local SCSI HBA card. - SAN, for which a couple of variations exists like FC, iSCSI - NAS, meaning using real "file based IO" (instead of block IO), like a Network Attached Storage device. When you consider DAS, or SAN LUNs, you can treat the storage in the "traditional" way, or you make use of a LVM. => Not using Logical Volume Management (LVM): Traditionally, local harddisks are divided into partitions. Next, a "filesystem", like ext3, is written directly on a partition. This is how you typically would use Linux, on some simple PC system. - Now, with the traditional methods, you cannot, for example, add two disks "together" to form some sort of larger contiguous volume, which you logically can use a one "disk". - Also, using the traditional methods, you cannot create redudant information for high availability purposes, like RAID 1 (mirrorring of a disk or partition), or RAID 5 and other RAID implementations. => Using an LVM: In Linux, LVM is often implemented using "LVM2", or EVMS, or Veritas LVM, or some other LVM. In LVM terminology, physical disks are called PV's (Physical Volumes). The key point is that you create one or more "Volume Groups" (VG) from the available PV's. Each VG is thus made up of a pool of Physical Volumes (PVs). You can extend (or reduce) the size of a VG, by adding or removing a PV, as desired. Once a VG is in place, you carve out (using LVM commands) a Logical Volume (LV) on which you place a filesystem. 
Note that an LV can span multiple disks (PV's). An LVM also provides means for redundancy, like
creating a mirrored LV, which makes the system much more robust.
As another plus, it's easy to increase the size of an existing LV, so if a filesystem like
/apps is (under the hood) actually some LV, it's easy to increase the size of /apps, as long
as the Volume Group has space available for that purpose.

Here is a very simple session. Suppose you have three extra disks, like /dev/sda, /dev/sdb,
and /dev/sdc.
Don't forget: below are generic commands. On your system, they might take a slightly different
form. But it should give you a reasonable idea of how we act in a typical LVM environment.

-> Step 1: formally add the disks to the LVM as usable PV's:

# pvcreate /dev/sda /dev/sdb /dev/sdc

So, now the LVM "knows" these disks are available as PV's.

-> Step 2: create a Volume Group from the PV's:

# vgcreate datavg /dev/sda /dev/sdb /dev/sdc

Here we have called the VG "datavg" and used all 3 disks.

-> Step 3: create one or more LV's in "datavg":

# lvcreate -L 500G -n oraclelv datavg

Here we have called the Logical Volume "oraclelv".

Next, we can create a filesystem on "oraclelv", using known methods:

# mkfs -t ext3 -v /dev/datavg/oraclelv

So where does the "device mapper" come in?
The Device-mapper is a standard component of the 2.6 (or higher) Linux kernel, which supports
logical volume management in a "more natural way". It keeps track of the mapping between the
physical devices and the "logical entities" used in LVM, like logical volumes.
Also, it manages the relation between the physical device files (like /dev/sdb) and the
entities found in "/proc/scsi/scsi" and "/sys/class/scsi_disk".

Note: With multipath SAN connections, the device mapper is even more "notable".

Here is an example of the "df -h" output of a machine using SAN luns.
Note the "/dev/mapper/" part in the devicenames.

[root@zigzag tmp]# df -h
Filesystem                  Size  Used  Avail  Use%  Mounted on
/dev/cciss/c0d0p7           3.9G  715M   3.0G   19%  /
/dev/cciss/c0d0p6           7.8G  1.4G   6.0G   19%  /usr
/dev/cciss/c0d0p5           7.8G  3.8G   3.7G   51%  /home
/dev/cciss/c0d0p3           7.8G  788M   6.6G   11%  /var
/dev/cciss/c0d0p8            85G   32G    49G   40%  /apps
/dev/cciss/c0d0p1           494M   17M   452M    4%  /boot
tmpfs                        31G  176M    31G    1%  /dev/shm
/dev/mapper/ocfs2backupp1   200G   44G   157G   22%  /ocfs2_backup
/dev/mapper/ocfs2appp1       30G  2.5G    28G    9%  /ocfs2_app

We will see some more of this in section 30, where we touch on the implementation of
"multipath" SAN storage connections, but on RedHat specifically.

============================================================================
29. A few typical examples of partitioning and creating filesystems:
============================================================================

On earlier occasions in this note, we have already seen how to use (for example) fdisk to
partition a disk, and next how to create a filesystem on that new partition.
Let's again walk through a few simple examples.

Example 1. A simple session on a simple local disk (not using LVM):
-------------------------------------------------------------------

Suppose you have a new local disk, with the device file "/dev/sda".
A simple session looks like this:

# fdisk /dev/sda             # first partition the disk. Fdisk will ask
                             # a few questions like whether you want a Primary
                             # or Extended partition, the ending track etc..
# mkfs.ext3 -b 4096 /dev/sda1    # create a filesystem of type ext3 on sda1
# mkdir /data                    # create a dir that will serve as a mountpoint
# mount -t ext3 /dev/sda1 /data  # mount it, so that it becomes available

If you want the mount to be available after the next boot, then edit /etc/fstab and add a
record resembling this:

/dev/sda1    /data    ext3    defaults    1 1

On your distribution, you might have several "mkfs-like" commands available, which basically
all do the same thing. It's just that some are shells "over" other commands, just to make it
easier for us. So, it's likely that your distribution has a script named "/sbin/mkfs.ext3"
(which we used above). Above, we could also have just used the "mkfs -t ext3" command.

Here are a few other examples:

# mke2fs /dev/sda1           # make an ext2 filesystem on sda1
# mke2fs -j /dev/sda1        # make an ext3 filesystem,
                             # due to the -j (journaling) switch
# mke2fs -t ext4 /dev/sda1   # make an ext4 filesystem

Exercise:
---------

If you are not too well aware of the different capabilities and properties of the ext2, ext3,
and ext4 filesystems, then use a search engine and find more information.
Especially, take notice of the max file sizes, max partition size, journaling options, and
whatever else you might find interesting.

Example 2. Creating a filesystem on a Logical Volume (using an LVM):
--------------------------------------------------------------------

So, how do we go about it if using an LVM?
Suppose you have a SCSI controller, which controls a disk array on your system.
So, here we say that we have DAS, or Directly Attached Storage. But what we will see below
holds for SAN Storage too. However, in case of SAN storage, it's wise to get more details about
the implementation of LUNs in the SAN, and whether additional features are in use, like
"multipath IO".

In section 28.3, we outlined the creation of a Volume Group (VG), from Physical Volumes (PVs).
Please take a look at that section again. Indeed, there we already saw an example of creating
a filesystem on a Logical Volume (LV).

Here is another example, using mke2fs for creating the filesystem:

# vgcreate datavg /dev/sda1 /dev/sdb1 /dev/sdc1
# lvcreate -L 500G -n oraclelv datavg   # oraclelv is the LV
                                        # within the datavg Volume Group
# mkdir /oracle                         # create the directory/mountpoint
# mke2fs -j /dev/datavg/oraclelv        # create an ext3 filesystem
# mount /dev/datavg/oraclelv /oracle    # mount it to make it usable

Note that "partitioning" is implemented in the creation of Logical Volumes.
So, here we do not use a tool like "fdisk" or "parted" anymore.

Note:
-----

Of course, the well-known "fdisk" utility does its work very well. However, if your system has
block devices that are greater than 2 TB, you should use the "parted" command to create
(or remove) partitions.
Parted is quite an extensive utility, with a number of subcommands like "cp", "mkpart" etc..

Example session:

# parted
GNU Parted 2.3
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) select /dev/sda
(parted) print
 ... shows all partitions of the selected disk sda
(parted) select /dev/sdc                # let's partition /dev/sdc
(parted) unit TB
(parted) mkpart primary 0.00TB 5.00TB
(parted) print
 ... shows all partitions of the selected disk sdc

It really deserves to have a manual for itself, and indeed it has. You might like to take a
look at:

http://www.gnu.org/software/parted/manual/html_mono/parted.html

============================================================================
30.
A few notes on implementing multipath IO to a FC SAN: ============================================================================ This section relies on all the theory, and command usage, as we have seen already in earlier chapters. 30.1 See your FC cards and drivers: ----------------------------------- -> FC Cards: For example, to see your (FC) HBA cards, we can use: # ls -al /sys/class/scsi_host or specifically for FC: # ls -al /sys/class/fc_host/ drwxr-xr-x 3 root root 0 Jan 13 2012 host0 drwxr-xr-x 3 root root 0 Jan 13 2012 host1 So, in this example we have two FC Cards, or HBA's, which are connected to a SAN Fabric or switch. -> Drivers: Use commands like: # lsmod | grep -i scsi # lsmod | grep -i lpfc # Emulex # lsmod | grep -i qla # Qlogic Now suppose we have Qlogic HBA cards, then we might see output like: qla2xxx 1133797 32 scsi_transport_fc 73800 1 qla2xxx scsi_mod 196697 7 scsi_dh,sg,qla2xxx,scsi_transport_fc,libata, cciss,sd_mod Indeed, the kernel modules are thus active. 30.2 Path to a SAN LUN: ----------------------- Also, recall that a "path" to an exposed LUN (disk) in the SAN, is expressed like: => In SCSI language: scsi_adapter#, channel#, scsi id, lun# => Using the naming conventions of Linux, this becomes: host#, bus#, target#, lun# Many commands will show LUN paths like for example [2:0:1:0] in which we can reckognize the [host,bus,target,lun] notation. 30.3 Multipath if using 2 FC cards: ----------------------------------- For HA reasons, a Server often has two FC cards. This is called "multipath". There are two main modes: - failover: one FC card is active, the other one is idle. In case of a problem, the SAN connection will be taken over by the former idle card. - aggregated: both cards are active at the same time, and possibly some loadbalancing procedure is in place. In it's most basic form, the setup resembles this: ----------------------- | Server | hba1= host0 | | hba2= host2 | [hba1] [hba2] | -------|---------|----- | | | | ----------------------- | | | | SWITCH | ----------- | | | | | ----------------------- | | [|] [|] ----------------------- SAN| | | | | | | | | |-------[lun] | -----------------|----- You see the 2 (or 4) paths to that LUN? (depending on the exact implementation). 30.4 How many physical devices are shown: ----------------------------------------- Suppose the SAN Admin has done all actions needed, to expose a LUN to our Server. In the setup above, at least two paths to the same LUN exists, but usually in such a setup, 4 possible paths exists. Now, suppose we formerly had only internal disks like hba and hbb. So, when Linux has performed it's scanning of all busses, we might see the following new devices: # ls –al /dev/sd* brw-r----- 1 root disk 8, 0 Oct 3 17:21 sda brw-r----- 1 root disk 8, 16 Oct 3 17:21 sdb brw-r----- 1 root disk 8, 32 Oct 3 17:23 sdc brw-r----- 1 root disk 8, 48 Oct 3 17:23 sdd This is what Linux "thinks" is going on. It can access a diskdevice along 4 paths, so it created 4 device files. We know that in reality, it is just the same diskdevice. Now, let's take a look at this: #ls -al /sys/class/scsi_disk total 0 drwxr-xr-x 6 root root 0 Oct 3 17:23 . drwxr-xr-x 42 root root 0 Oct 3 17:21 .. drwxr-xr-x 2 root root 0 Oct 3 17:21 0:0:0:0 drwxr-xr-x 2 root root 0 Oct 3 17:21 0:0:1:0 drwxr-xr-x 2 root root 0 Oct 3 17:23 1:0:0:0 drwxr-xr-x 2 root root 0 Oct 3 17:23 1:0:1:0 Here, 4 "LUNs" are shown, due to the fact that 4 paths are available. 
If you "read" those paths, you can see that LUN0 can be reached via host0, that is, "[0:", and via host1, that is "[1:". Next, we will discuss what we need to do using a specific distribution, namely RedHat. With other distributions, a similar approach is followed. 30.5 Installing DM-Multipath software: -------------------------------------- Having drivers and connections, is not enough. We need a specific "multipath kernel module", and a "service" which monitors the HBA's and all paths. In case of failure, the kernelmodule will switch IO to the idle card. Installing and configuring the software means the following: - install: # rpm -ivh device-mapper-multipath.rpm - configure: Edit the configuration file "/etc/multipath.conf" to configure which devices with a certain WWID (see the next section) will fall under multipath, and which device to ignore (blacklisting). - starting the daemon: # service multipathd start # chkconfig multipathd on As you can see, the daemon is "multipathd", which should show up using the "ps -ef" command. However, you should not start the daemon, before "/etc/multipath.conf" is configured correctly. 30.6 How it works, and "/etc/multipath.conf": --------------------------------------------- To edit the configuration file "/etc/multipath.conf" in the correct way is crucial. If we take a look again at our four devices, let's determine the WWID, which is supposed to be a unique identifier for each device: # scsi_id -g -u -s /block/sda 3600601601310310095182d72710de211 # scsi_id -g -u -s /block/sdb 3600601601310310095182d72710de211 # scsi_id -g -u -s /block/sdc 3600601601310310095182d72710de211 # scsi_id -g -u -s /block/sdd 3600601601310310095182d72710de211 You see? They all have the same identifier, that is, the same WWID ! The trick now really is, to place an entry in /etc/multipath.conf, specifying the WWID which should fall under multipath, and specifying a friendly name. So, let's edit "/etc/multipath.conf": root@zigzag /etc# vi multipath.conf devnode_blacklist { devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*" devnode "^hd[a-z]" devnode "^cciss!c[0-9]d[0-9]*" } multipaths { multipath { wwid 3600601601310310095182d72710de211 alias ocfs2ora path_grouping_policy failover } } There are 2 essential "parts" here. The "devnode_blacklist" section tells the software which devices to ignore. Here it means that all devices like raw,loop,fd,md,hd etc.., must be ignored. The "multipaths" section, tells the daemon which devices have the same WWID, and thus are the same, and it is exactly that device which falls under multipath. Also, an alias is specified, which means the "device mapper" will create the friendly name "/dev/mapper/ocfs2ora". So, if we now want to use "fdisk", or "parted", or if we want to create a Physical Volume (in LVM), we *should* now use this friendly name. When the software is installed, you also have the "multipath" command available with which you can check which paths and devices are actually the stuff under the "alias". In our example: # multipath -l ocfs2ora (3600601601310310095182d72710de211) dm-2 HP,HSV210 [size=300G][features=1 queue_if_no_path][hwhandler=0][rw] \_ round-robin 0 [prio=0][active] \_ 0:0:0:0 sda 8:32 [active][undef] \_ round-robin 0 [prio=0][enabled] \_ 0:0:1:0 sdb 8:96 [active][undef] \_ round-robin 0 [prio=0][enabled] \_ 1:0:0:0 sdc 8:160 [active][undef] \_ round-robin 0 [prio=0][enabled] \_ 1:0:1:0 sdd 8:224 [active][undef]