/*****************************************************************************/
/* Document     : Simplified descriptions and examples of some well known   */
/*                technical Architectures, or methods, in IT.               */
/* Doc. Version : 25                                                         */
/* File         : architectures.txt                                          */
/* Date         : 05-04-2009                                                 */
/* Content      : a few (hopefully) interesting views on some architectures. */
/*                But it's geared somewhat towards popular platforms.        */
/* Compiled by  : Albert van der Sel                                         */
/* Note         : The independent sections are separated by                  */
/*                three "###" lines, for easy identification.                */
/* Best usage   : Use find/search in your editor to search for a             */
/*                phrase, identifier, command etc.., or on the literal       */
/*                text mentioned in the Contents (Section number + text).    */
/*****************************************************************************/

Contents:

Section 1.  A Sketch of the CORBA architecture: (this is a very lightweight discussion)
Section 2.  (Traditional) Client connections to SQL Server
Section 3.  N-tier architectures: (1) Browser based Client connections to a Server
Section 4.  IPC: Named pipes, Sockets, and Multiprotocol in Windows
Section 5.  IPC in UNIX
Section 6.  "Traditional" Cluster system in Redhat Linux
Section 7.  Oracle 10g RAC example on Redhat Linux
Section 8.  CLUSTERS ON AIX: GPFS (also repeated in section 18)
Section 9.  CLUSTERS ON AIX: HACMP (also repeated in section 18)
Section 10. Cisco IOS version 10.x, 11.x, 12.x router commands
Section 11. Basic VMS commands and Operations
Section 12. NT/200x/XP CMD shell script examples
Section 13. OO and C# elementary code fragments and basic DOT.NET theory
Section 14. Basic PHP theory and examples
Section 15. Extended PL/SQL examples and code snippets
Section 16. Basic VB and VBscript code snippets
Section 17. SQL Server 7 and 2000 system queries
Section 18. BIG SECTION: UNIX: AIX, HP, Solaris, Linux commands and architecture
Section 19. BIG SECTION: Oracle RDBMS 8,8i,9i,10g system queries and architecture
Section 20. How to trace in Unix
Section 21. How to undelete a file in UNIX.
Section 22. Oracle 10g/11g RAC
Section 23. MQ errors and messages
Section 24. A collection of Unix errorcodes

#############################################################################################
#############################################################################################
#############################################################################################

=====================================================================================
Section 1. A sketch of the CORBA architecture: (this is a very lightweight discussion)
=====================================================================================

 ----------                -------------------------
 | CLIENT |                | Object Implementation |
 ----------                -------------------------
   | IDL |                      | IDL      |
   |STUB |                      | Skeleton |
 ----------------------------------------------------
 |    |          -------------         |            |
 |    --->-->    |  REQUEST  |  ---->--             |
 |               -------------                      |
 |           OBJECT REQUEST BROKER (ORB)            |
 -----------------------------------------------------

 Fig 1: local connection


 ----------     ---------                   ---------    ----------
 | Client |     |Object |                   |Client |    |Object  |
 ----------     ---------                   ---------    ----------
   |STUB|         |SKEL|                     |STUB|        |SKEL|
 --------------------------                ------------------------
 |    |             ^     |                |                ^     |
 |    |-> ORB 1-->--|     |--->------------|---->-- ORB 2---|     |
 |                        |      IIOP      |                      |
 --------------------------    protocol    ------------------------

 Fig 2: remote invocation

CORBA is about distributed (networked) computing. You create "objects" and "Interface Definitions"
(through IDL: Interface Definition Language), using a well defined infrastructure.
In theory, it is platform and OS independent.

The Common Object Request Broker Architecture (CORBA) [OMG:95a] is an emerging open distributed
object computing infrastructure, (being) standardized by the Object Management Group (OMG).
CORBA automates many common network programming tasks such as object registration, location,
and activation; request demultiplexing; framing and error-handling; parameter marshalling and
demarshalling; and operation dispatching.

At least theoretically, using the standard protocol IIOP, a CORBA-based program from any vendor,
on almost any computer, operating system, programming language, and network, can interoperate
with a CORBA-based program from the same or another vendor, on almost any other computer,
operating system, programming language, and network.

CORBA applications are composed of objects, individual units of running software that combine
functionality and data, and that frequently (but not always) represent something in the real world.
Typically, there are many instances of an object of a single type. For example, an e-commerce
website would have many shopping cart object instances, all identical in functionality but differing
in that each is assigned to a different customer, and contains data representing the merchandise
that its particular customer has selected. For other types, there may be only one instance.
When a legacy application, such as an accounting system, is wrapped in code with CORBA interfaces
and opened up to clients on the network, there is usually only one instance.

For each object type, such as the shopping cart that we just mentioned, you define an interface
in OMG IDL. The interface is the syntax part of "the contract" that the server object offers to the
clients that invoke it. Any client that wants to invoke an operation on the object must use this
IDL interface to specify the operation it wants to perform, and to marshal the arguments that it sends.
When the invocation reaches the target object, the same interface definition is used there to
unmarshal the arguments so that the object can perform the requested operation with them.
The interface definition is then used to marshal the results for their trip back, and to unmarshal
them when they reach their destination.

The IDL interface definition is independent of programming language, but maps to all of the popular
programming languages via OMG standards: OMG has standardized mappings from IDL to C, C++, Java,
COBOL, Smalltalk, Ada, Lisp, Python, and IDLscript.

The interface to each object is defined very strictly. In contrast, the implementation of an object
(its running code, and its data) is hidden from the rest of the system (that is, encapsulated)
behind a boundary that the client may not cross. Clients access objects only through their advertised
interface, invoking only those operations that the object exposes through its IDL interface,
with only those parameters (input and output) that are included in the invocation.

Figure 1 shows how everything fits together, at least within a single process:
You compile your IDL into client stubs and object skeletons, and write your object (shown on the right)
and a client for it (on the left). Stubs and skeletons serve as proxies for clients and servers,
respectively. Because IDL defines interfaces so strictly, the stub on the client side has no trouble
meshing perfectly with the skeleton on the server side, even if the two are compiled into different
programming languages, or even running on different ORBs from different vendors.

How do remote invocations work?

Figure 2 diagrams a remote invocation. In order to invoke the remote object instance, the client
first obtains its object reference. (There are many ways to do this, but we won't detail any of them
here. Easy ways include the Naming Service and the Trader Service.)
To make the remote invocation, the client uses the same code that it used in the local invocation
we just described, substituting the object reference for the remote instance.
When the ORB examines the object reference and discovers that the target object is remote, it routes
the invocation out over the network to the remote object's ORB.
(Note: for load balanced servers, this is an oversimplification.)
(Note the similarity with OO "early binding": the stub and skeleton are generated at compile time,
so the operations that can be invoked are already bound at that point.)

How does this work? OMG has standardized this process at two key levels:

First, the client knows the type of object it's invoking (that it's a shopping cart object,
for instance), and the client stub and object skeleton are generated from the same IDL.
This means that the client knows exactly which operations it may invoke, what the input parameters
are, and where they have to go in the invocation; when the invocation reaches the target, everything
is there and in the right place. We've already seen how OMG IDL accomplishes this.

Second, the client's ORB and object's ORB must agree on a common protocol, that is, a representation
to specify the target object, operation, all parameters (input and output) of every type that they
may use, and how all of this is represented over the wire. OMG has defined this also: it's the
standard protocol IIOP. (ORBs may use other protocols besides IIOP, and many do for various reasons.
But virtually all speak the standard protocol IIOP for reasons of interoperability, and because
it's required by OMG for compliance.)

Some examples of typical code:
------------------------------

Hello.idl, the interface definition

The following file, Hello.idl, is written in the OMG Interface Definition Language, and describes
a CORBA object whose sayHello() operation returns a string and whose shutdown() operation shuts down
the ORB. OMG IDL is a purely declarative language designed for specifying
programming-language-independent operational interfaces for distributed applications.
The IDL can be mapped to a variety of programming languages.
The IDL mapping for Java is summarized in "IDL to Java Language Mapping Summary".

Hello.idl

  module HelloApp
  {
    interface Hello
    {
      string sayHello();
      oneway void shutdown();
    };
  };

#############################################################################################
#############################################################################################
#############################################################################################

=============================================================================
Section 2. (Traditional) Client connections to SQL Server:
=============================================================================

 -------    -------    -------    -------
 |App 1|    |App 2|    |App 3|    |App 4|
 -------    -------    -------    -------
    |          |          |          |
    |       -------    -------       |
    |       |ADO  |    |RDO  |       |
    |       -------    -------       |
    |          |          |          |
 -----------------     ---------------
 |OLE DB         |     |ODBC         |      (TabularDataStream TDS)
 -----------------     ---------------
         |                    |
 -----------------------------------
 |Client Network library api       |
 |- named pipes                    |
 |- tcpip sockets                  |
 |- multiprotocol                  |
 -----------------------------------
         |
         |
          \            network stack: tcp/ip, spx/ipx etc..
 ----------------_\--------------------------------------
                   \
                    \
                     |
 -----------------------------------
 |SQL Server network library       |
 -----------------------------------
         |
 -----------------
 |SQL Server     |   (TDS)
 -----------------

#############################################################################################
#############################################################################################
#############################################################################################

==================================================================================
Section 3. N-tier architectures: (1) Browser based Client connections to a Server:
==================================================================================

Example 1: ASP
--------------

 Client                           WebServer + ASP engine
                                  ------------------------------
 -------     request test.asp     | <% if Hour(Now)< 12 then %>|
 |       |--------------------->  | Good Morning.              |
 |Web    |                        | <% else %>                 |
 |browser|   test.htm returned    | Good Day.                  |
 |       | <------------------    | <% end if %>               |
 ------                      |    ------------------------------
    |                        |
    |                        |    if it's before 12.00 o'clock
    |                        V
    -----<------ Good Morning


Example 2: jsp, servlets, java
------------------------------

                                ....................................
                                .                                  .   optional backend DB
               http://.../x.jsp . -----------------   ----------   .   ----------
 ------------                   . | jsp page      |   |BEANS or|   .   |        |
 | BROWSER  |-----------------> . |               |<->|EJB     |<------>|DataBase |
 |          |                   . |Web Server     |   |        |   .   |        |
 |          |<----------------- . |JSP Engine     |   ----------   .   -----------
 ------------                   . |               |                .
                                . ------------------               .
                                .                                  .
                                ....................................


Example 3: N-tier architecture jsp, servlets, java, middleware
--------------------------------------------------------------

                                ....................................
                                .                                  .                   optional backend DB
               http://.../x.jsp . -----------------   ----------   . ------------      ----------
 ------------                   . | jsp page      |   |BEANS or|   . |            |    |        |
 | BROWSER  |-----------------> . |               |<->|EJB     |<--->|- Jolt/     |<-->|DataBase|
 |          |                   . |Web Server     |   ----------   . |  Tuxedo    |    |        |
 |          |<----------------- . |iAS, Websphere |                . |  middleware|    ----------
 ------------                   . |               |<---------------->|- Cobol obj |
                                . ------------------               . --------------
                                .                                  .
                                ....................................
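
Example 1 above relies on the branch being evaluated on the server, so the browser only ever
receives the resulting HTML. As a rough illustration of that idea (not the ASP engine itself),
the same logic can be sketched in Python as a minimal WSGI application. The names render_greeting
and app are hypothetical and chosen only for this sketch.

```python
from datetime import datetime


def render_greeting(hour: int) -> str:
    """Return the HTML the server would send for the given hour (0-23)."""
    # Same branch as the ASP fragment: before 12:00 gives the morning greeting.
    greeting = "Good Morning." if hour < 12 else "Good Day."
    return f"<html><body>{greeting}</body></html>"


def app(environ, start_response):
    """Minimal WSGI application: the decision is taken on the server side."""
    body = render_greeting(datetime.now().hour).encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Any WSGI server (for example wsgiref.simple_server from the standard library) could serve app;
the browser never sees the if/else, only the text that was selected.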
#############################################################################################
#############################################################################################
#############################################################################################

=============================================================================
Section 4. IPC: Named pipes, Sockets, and Multiprotocol in Windows:
=============================================================================

This section is Windows orientated. See Section 5 for a Unix or more general interpretation.

4.1 TCP/IP Sockets:
-------------------

Suppose the Server "10.10.10.1" has multiple Server programs running.
How does a client differentiate between the multiple Server programs?
The usual way with tcpip is the use of sockets.

A socket is an "identifier" completely identifying the location of a Server on the network,
as well as the "port" the server service is listening on, like for example:

  10.10.10.1 : 1521

or for example

  10.10.10.1 : 1433

The client must know the "port" on which the desired Host program or service is listening.
For example, it could come from a local services file, or some registry.

The client constructs a tcp header in which the destination port field holds the port
on which the Host Server service or daemon is listening.
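
The idea that one host can run several services, told apart only by their port, can be sketched
with Python's standard socket module. This is a minimal illustration under stated assumptions:
it uses the loopback address 127.0.0.1 and OS-chosen free ports instead of 10.10.10.1:1521 and
10.10.10.1:1433, and the helper names start_service and ask are hypothetical.

```python
import socket
import threading


def start_service(name: str) -> tuple[socket.socket, int]:
    """Start a tiny TCP service on 127.0.0.1 and return (listening socket, port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    srv.listen()
    port = srv.getsockname()[1]

    def serve_once() -> None:
        # Accept a single client, send a greeting, then close the connection.
        conn, _addr = srv.accept()
        with conn:
            conn.sendall(f"hello from {name}".encode("ascii"))

    threading.Thread(target=serve_once, daemon=True).start()
    return srv, port


def ask(host: str, port: int) -> str:
    """Connect to host:port and return everything the service sends."""
    with socket.create_connection((host, port), timeout=5) as client:
        chunks = []
        while True:
            data = client.recv(1024)
            if not data:            # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii")


if __name__ == "__main__":
    srv_a, port_a = start_service("service A")
    srv_b, port_b = start_service("service B")
    # Same IP address, different ports: the port selects the service.
    print(ask("127.0.0.1", port_a))
    print(ask("127.0.0.1", port_b))
    srv_a.close()
    srv_b.close()
```

Both services share one IP address; the client reaches the one it wants purely through the
destination port it puts in the connection request.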
                            Server, IP=10.10.10.1
                            |------------------------------------------------
                            |                                               |
                            |    ----------          ---------------        |
                            |    | Oracle |          | SQL server  |        |
                            |    ----------          ---------------        |
                            |        |                      |               |
                            | ------------------   ---------------------    |
                            | |Oracle listener  |  |SQL Server listener |   |
                            | |listening on port|  |listening on port   |   |
                            | |1521             |  | 1433               |   |
                            | ------------------   --------------------     |
                            |        ^                      ^               |
                            |        |                      |               |
 client request for         |    1521|                  1433|               |
 connection to Oracle       |        |                      |               |
 10.10.10.1:1521            | -------------------------------               |
 ---------------------->    | |Portmapper / Netlib router   |               |
                            | |or other "service", handling |               |
 Client request for         | |requests to the desired host |               |
 connection to SQL Server   | |program                      |               |
 10.10.10.1:1433            | |                             |               |
 ---------------------->    | |                             |               |
                            | ------------------------------                |
                            |                                               |
                            |------------------------------------------------

4.2 Named pipes:
----------------

A high level process, like a client program, can open and write to a "special file", the "named pipe".
The named pipe can be considered to be at the OSI layer 7, and is an IPC mechanism for process to
process communication, locally or across a network.

In Windows, the design of named pipes is biased towards client-server communication, and they work
much like sockets: other than the usual read and write operations, Windows named pipes also support
an explicit "passive" mode for server applications (compare: UNIX domain sockets).
Unlike in UNIX, Windows named pipes aren't permanent and can't be created as special files on any
writable filesystem. They are volatile names (freed after the last reference to them is closed),
allocated in the root directory of the named pipe filesystem (NPFS), which is mounted under the
special path \\.\pipe\ (that is, a pipe named "foo" would have a full path name of \\.\pipe\foo).
Anonymous pipes used in pipelining actually are named pipes with a random name.

In "constructing" the client program (VB, C++, VB.NET, C# etc...) there is some sort of mechanism
to open (connect to) a named pipe, for example:

  Public Declare Function CallNamedPipe Lib "kernel32" Alias "CallNamedPipeA" _
  (ByVal lpNamedPipeName As String, etc......

The pipe is an IPC construct above any network protocol such as sockets/tcp/ip, or nwlink spx/ipx etc..
To reach a remote system it uses the IPC$ share of that system, just like a filesystem share.

  \\computername\pipe\MSSQL$instancename\sql\query

 CLIENT:
 ----------------------------------------     rw to and from pipe
 named pipe \\.\pipe\sql\query,            <------------------------->   Server
 named pipe functions like a sort
 of URL or share
 ----------------------------------------
 session management, sockets, netbios
 ----------------------------------------
 TCP                 SPX
 ----------------------------------------
 IP                  IPX
 ----------------------------------------
 Datalink
 ----------------------------------------
 physical network
 ----------------------------------------

4.3 Multiprotocol:
------------------

It's a protocol that layers over named pipes, tcpip sockets, or nwlink spx/ipx sockets.
So you MUST have one of the above IPC mechanisms available.

The Multiprotocol selection has two key features:

- Automatic selection of an available network protocol to communicate with an instance of
  Microsoft® SQL Server™. This is convenient when you want to connect to multiple servers running
  different network protocols but do not want to reconfigure the client connection for each server.
  If the client and server Net-Libraries for TCP/IP Sockets, NWLink IPX/SPX, or Named Pipes are
  installed on the client and server, the Multiprotocol Net-Library will automatically choose the
  first available network protocol to establish a connection.

- Client encryption. You can enforce encryption over the Multiprotocol Net-Library on clients
  running on the Microsoft Windows NT® 4.0, Windows® 2000, Windows 95, or Windows 98 operating
  system to prevent others from intercepting and viewing sensitive data.
The Multiprotocol Net-Library takes advantage of the remote procedure call (RPC) facility of
Windows NT 4.0 and Windows 2000, which provides Windows Authentication.
For the Multiprotocol Net-Library, clients determine the server address using the server name.

Usage Considerations

Before using the Multiprotocol Net-Library, consider the following:

- The Multiprotocol Net-Library does not support named instances of SQL Server 2000.
  You can use the Multiprotocol Net-Library to connect to the default instance of SQL Server
  on a computer, but you cannot connect to any named instances.

- The Multiprotocol Net-Library does not support server enumeration.
  From applications that can list servers by calling dbserverenum, you cannot identify servers
  running an instance of SQL Server and listening on the Multiprotocol Net-Library.

#############################################################################################
#############################################################################################
#############################################################################################

=============================================================================
Section 5. IPC in UNIX:
=============================================================================

#############################################################################################
#############################################################################################
#############################################################################################

=============================================================================
Section 6. "Traditional" Cluster system in Redhat Linux
=============================================================================

The Red Hat "Cluster Manager" software was originally based on the open source Kimberlite
(http://oss.missioncriticallinux.com/kimberlite/) cluster project, which was developed by
Mission Critical Linux, Inc.
Since that Kimberlite-based start, developers at Red Hat have made a large number of
enhancements and modifications.

6.1 Cluster Overview:

To set up a cluster, an administrator must connect the cluster systems (often referred to as
member systems) to the cluster hardware, and configure the systems into the cluster environment.
The foundation of a cluster is an advanced host membership algorithm. This algorithm ensures that
the cluster maintains complete data integrity at all times by using the following methods of
inter-node communication:

- Quorum partitions on shared disk storage to hold system status
- Ethernet and serial connections between the cluster systems for heartbeat channels

To make an application and data highly available in a cluster, the administrator must configure
a "cluster service": a discrete group of service properties and resources, such as an application
and shared disk storage. A service can be assigned an IP address to provide transparent client
access to the service. For example, an administrator can set up a cluster service that provides
clients with access to highly-available database application data.

Both cluster systems can run any service and access the service data on shared disk storage.
However, each service can run on only one cluster system at a time, in order to maintain
data integrity. Administrators can set up

- an "active-active" configuration in which both cluster systems run different services, or
- an "active-passive" (hot-standby) configuration in which a primary cluster system runs all
  the services, and a backup cluster system takes over only if the primary system fails.

NOTE: So this is actually a difference from Oracle 10g Real Application Cluster (RAC), where both
      instances, or multiple instances (from 2 - 100), access the single database on shared storage
      at the same time!
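
The rule that each service runs on only one cluster system at a time, and that the surviving
system takes over after a failure, can be illustrated with a small Python model. This is only a
sketch of the bookkeeping, not of Red Hat Cluster Manager itself: the class Cluster, its methods,
and the member names node1 and node2 are hypothetical, and the service names ServiceA, ServiceB
and ServiceC are borrowed from the figure used in this section.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Two-member cluster model: every service is owned by at most one member."""
    members: tuple[str, str]
    owner: dict[str, str] = field(default_factory=dict)   # service -> member
    up: set[str] = field(init=False)                       # members currently active

    def __post_init__(self) -> None:
        self.up = set(self.members)

    def start(self, service: str, member: str) -> None:
        """Start a service on one member; refuse to run it twice."""
        if member not in self.up:
            raise ValueError(f"{member} is not an active cluster member")
        if service in self.owner:
            raise ValueError(f"{service} already runs on {self.owner[service]}")
        self.owner[service] = member

    def fail(self, member: str) -> list[str]:
        """Mark a member as failed and restart its services on the survivor."""
        self.up.discard(member)
        survivors = sorted(self.up)
        moved = sorted(s for s, m in self.owner.items() if m == member)
        for service in moved:
            if survivors:
                self.owner[service] = survivors[0]
            else:
                del self.owner[service]   # no member left to run the service
        return moved


if __name__ == "__main__":
    # Active-active: both members run different services.
    cluster = Cluster(("node1", "node2"))
    cluster.start("ServiceA", "node1")
    cluster.start("ServiceB", "node1")
    cluster.start("ServiceC", "node2")
    print(cluster.fail("node1"), cluster.owner)
```

An active-passive setup is the same model with every service started on one member; the other
member owns nothing until fail() moves the services over.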
Sketch of a 2-node Linux cluster

    ------------------------------------------ public network
          |                            |
          |                            |
    ------------                 -------------
    |cluster   |                 |cluster    |
    |system    |Ethernet         |system     |
    |          |-----------------|           |
    |          |heartbeat        |           |
    |          |                 |           |
    |          |____________     |           |
    |ServiceA  |  -----   -|---  |           |
    |ServiceB  |--|PWR|   |PWR|--|ServiceC   |
    |          |  -----   -----  |           |
    |          |    |____________|           |
    |          |                 |           |
    ------------                 -------------
          |    SCSI bus or Fibre Channel     |
          ------------------   --------------
              Interconnect |   |
                           |   |                      Fig 6.1
                        -----------
                        |Shared   |  - has Quorum partition
                        |Disk     |  - has partitions for ServiceA, B, C
                        |Storage  |
                        -----------

Figure 6.1 shows an example of a cluster in an active-active configuration.

If a hardware or software failure occurs, the cluster will automatically restart the failed system's
services on the functional cluster system. This service failover capability ensures that no data
is lost, and there is little disruption to users. When the failed system recovers, the cluster can
re-balance the services across the two systems.
In addition, a cluster administrator can cleanly stop the services running on a cluster system and
then restart them on the other system. This service relocation capability enables the administrator
to maintain application and data availability when a cluster system requires maintenance.

-- Service configuration framework:

Clusters enable an administrator to easily configure individual services to make data and
applications highly available. To create a service, an administrator specifies the resources used
in the service and properties for the service, including the service name, application start and
stop script, disk partitions, mount points, and the cluster system on which an administrator
prefers to run the service. After the administrator adds a service, the cluster enters the
information into the cluster database on shared storage, where it can be accessed by both
cluster systems.

The cluster provides an easy-to-use framework for database applications.
For example, a database service serves highly-available data to a database application.
The application running on a cluster system provides network access to database client systems,
such as Web servers. If the service fails over to another cluster system, the application can still
access the shared database data. A network-accessible database service is usually assigned an
IP address, which is failed over along with the service to maintain transparent access for clients.
The cluster service framework can be easily extended to other applications, as well.

-- Multiple cluster communication methods:

To monitor the health of the other cluster system, each cluster system monitors the health of the
remote power switch, if any, and issues heartbeat pings over network and serial channels.
In addition, each cluster system periodically writes a timestamp and cluster state information to
two quorum partitions located on shared disk storage. System state information includes whether the
system is an active cluster member. Service state information includes whether the service is
running and which cluster system is running the service. Each cluster system checks to ensure that
the other system's status is up to date.

To ensure correct cluster operation, if a system is unable to write to both quorum partitions at
startup time, it will not be allowed to join the cluster. In addition, if a cluster system is not
updating its timestamp, and if heartbeats to the system fail, the cluster system will be removed
from the cluster.

If a hardware or software failure occurs, the cluster will take the appropriate action to maintain
application availability and data integrity. For example, if a cluster system completely fails,
the other cluster system will restart its services. Services already running on this system are
not disrupted.
When the failed system reboots and is able to write to the quorum partitions, it can rejoin the
cluster and run services. Depending on how the services are configured, the cluster can re-balance
the services across the two cluster systems.

-- Manual service relocation capability:

In addition to automatic service failover, a cluster enables administrators to cleanly stop
services on one cluster system and restart them on the other system. This allows administrators
to perform planned maintenance on a cluster system, while providing application and data
availability.

-- Event logging facility:

To ensure that problems are detected and resolved before they affect service availability,
the cluster daemons log messages by using the conventional Linux syslog subsystem.
Administrators can customize the severity level of the logged messages.

-- Application Monitoring:

The cluster services infrastructure can optionally monitor the state and health of an application.
In this manner, should an application-specific failure occur, the cluster will automatically
restart the application. The cluster first attempts to restart the application on the member it
was initially running on; failing that, it restarts the application on the other cluster member.

-- Status Monitoring Agent:

A cluster status monitoring agent is used to gather vital cluster and application state information.
This information is then accessible both locally on the cluster member as well as remotely.
A graphical user interface can then display status information from multiple clusters in a manner
which does not degrade system performance.

6.2 Notes about Shared Storage:

The operation of the cluster depends on reliable, coordinated access to shared storage.
In the event of hardware failure, it is desirable to be able to disconnect one member from the
shared storage for repair without disrupting the other member.
Shared storage is truly vital to the cluster configuration.
Testing has shown that it is difficult, if not impossible, to configure reliable multi-initiator
parallel SCSI configurations at data rates above 80 MBytes/sec. using standard SCSI adapters.
Further tests have shown that these configurations cannot support online repair because the bus
does not work reliably when the HBA terminators are disabled, and external terminators are used.
For these reasons, multi-initiator SCSI configurations using standard adapters are not supported.
Single-initiator parallel SCSI buses, connected to multi-ported storage devices, or Fibre Channel,
are required.

The Red Hat Cluster Manager requires that both cluster members have simultaneous access to the
shared storage. Certain host RAID adapters are capable of providing this type of access to shared
RAID units. These products require extensive testing to ensure reliable operation, especially if
the shared RAID units are based on parallel SCSI buses. These products typically do not allow for
online repair of a failed system. No host RAID adapters are currently certified with Red Hat
Cluster Manager. Refer to the Red Hat web site at http://www.redhat.com for the most up-to-date
supported hardware matrix.

The use of software RAID, or software Logical Volume Management (LVM), is not supported on shared
storage. This is because these products do not coordinate access from multiple hosts to shared
storage. Software RAID or LVM may be used on non-shared storage on cluster members (for example,
boot and system partitions and other filesystems which are not associated with any cluster services).

6.3 More detailed view of an almost "No single point of failure" 2-Node Clustered System:

                              ----------
                              |NETWORK |
           -------------------|SWITCH  |-----------------------
           |                  ----------                      |
           |                       |                          |
 ---------------------             |               ---------------------
 |network interface  |         ----------          |network interface  |
 |--------------------         |terminal|          |--------------------
 |serial port        |---------|server  |----------|serial port        |
 |--------------------         ----------          |--------------------
 |CLUSTER            |                             |CLUSTER            |
 |SYSTEM             |                             |SYSTEM             |
 |--------------------                             |--------------------
 |network interface  |-----------------------------|network interface  |
 |--------------------                             |--------------------
 |serial port        |-----------------------------|serial port        |
 |--------------------                             |--------------------
 |serial port        |-----------------\           |                   |
 |--------------------     -----      -----        |--------------------
 |power plug         |-----|PWR|      |PWR|--------|power plug         |
 |--------------------     -----      -----        |--------------------
 |                   |       |                     |-------------------|
 |                   |       ----------------------|serial port        |
 |--------------------                             |--------------------
 |SCSI adapter (T)   |                             |SCSI adapter (T)   |
 ---------------------                             ---------------------
           |                                                  |
           |                                                  |
           -----------      -----------------------------------
                     |      |
                    (T)    (T)
 -------------------------------------------------------
 |       Port A/in | Port B/in                         |
 |       Port A/Out| Port B/Out                        |
 |------------------------------------------------------
 |           |                         |               |
 |  -------------------      --------------------      |
 |  |controller 1     |      |controller 2      |      |
 |  -------------------      --------------------      |
 |           |                         |               |
 |           |          RAID           |               |
 |          ( )                       ( )              |
 |           |                         |               |
 |          ( )                       ( )              |
 |           |                         |               |
 |              mirrored shared disks                  |
 -------------------------------------------------------

single-initiator
----------------

A single-initiator SCSI bus has only one node connected to it, and provides host isolation and
better performance than a multi-initiator bus. Single-initiator buses ensure that each node is
protected from disruptions due to the workload, initialization, or repair of the other nodes.
When using a single- or dual-controller RAID array that has multiple host ports and provides
simultaneous access to all the shared logical units from the host ports on the storage enclosure,
it is possible to set up single-initiator SCSI buses to connect each cluster node to the RAID array.
If a logical unit can fail over from one controller to the other, the process must be transparent
to the operating system. Note that some RAID controllers restrict a set of disks to a specific
controller or port. In this case, single-initiator bus setups are not possible.

To set up a single-initiator SCSI bus configuration, perform the following steps:

- Enable the onboard termination for each host bus adapter.
- Enable the termination for each RAID controller.
- Use the appropriate SCSI cable to connect each host bus adapter to the storage enclosure.

Setting host bus adapter termination is done in the adapter BIOS utility during system boot.
To set RAID controller termination, refer to the vendor documentation.

 ---------       SI SCSI bus           --------------
 |      T|---------------              |        HBA |
 |HBA    |              |    ----------|T           |
 |       |              |    |         |            |
 ---------              |    |         --------------
                        |    |
                    -------------------
                    |   T    T        |
                    |Storage Enclosure|
                    -------------------

In general, recommended for Linux and Sun clusters.

Multi Initiator SCSI
--------------------

Multi Initiator SCSI configurations are configurations with two SCSI host adapter boards
connected to a single SCSI bus, like in the following example:

 --------        MI SCSI bus           --------------
 |     T|------------------------------|T           |
 |      |               |              |            |
 |HBA   |               |              |HBA         |
 |      |               |              |            |
 --------               |              --------------
                    -------------------
                    |   T             |
                    |Storage Enclosure|
                    -------------------

In general, not recommended for Linux or Solaris clusters.
#############################################################################################
#############################################################################################
#############################################################################################


=============================================================================
7. Oracle 10g RAC example on Redhat Linux:
=============================================================================

7.1 Overview:
-------------

- RAC Architecture Overview

Let's begin with a brief overview of RAC architecture.

A cluster is a set of 2 or more machines (nodes) that share or coordinate resources to perform the same task.
A RAC database is 2 or more instances running on a set of clustered nodes, with all instances accessing
a shared set of database files.
Depending on the O/S platform, a RAC database may be deployed on a cluster that uses vendor clusterware
plus Oracle's own clusterware (Cluster Ready Services), or on a cluster that solely uses Oracle's own clusterware.

Thus, every RAC sits on a cluster that is running Cluster Ready Services.
srvctl is the primary tool DBAs use to configure CRS for their RAC database and processes.

- Cluster Ready Services and the OCR

Cluster Ready Services, or CRS, is a new feature for 10g RAC. Essentially, it is Oracle's own clusterware.
On most platforms, Oracle supports vendor clusterware; in these cases, CRS interoperates with the vendor
clusterware, providing high availability support and service and workload management.
On Linux and Windows clusters, CRS serves as the sole clusterware. In all cases, CRS provides a standard
cluster interface that is consistent across all platforms.

CRS consists of four processes (crsd, ocssd, evmd, and evmlogger) and two disks:
the Oracle Cluster Registry (OCR), and the voting disk.

CRS manages the following resources:

 . The ASM instances on each node
 . Databases
 . The instances on each node
 . Oracle Services on each node
 . The cluster nodes themselves, including the following processes, or "nodeapps":
   . VIP
   . GSD
   . The listener
   . The ONS daemon

CRS stores information about these resources in the OCR. If the information in the OCR for one of these
resources becomes damaged or inconsistent, then CRS is no longer able to manage that resource.
Fortunately, the OCR automatically backs itself up regularly and frequently.

10g RAC (10.2) uses, or depends on:

- Oracle Clusterware (10.2), formerly referred to as CRS "Cluster Ready Services" (10.1).
- Oracle's Cluster File System OCFS (this is optional), or use ASM and RAW.
- Oracle Database extensions

RAC is "scale out" technology: just add commodity nodes to the system. The key component is "cache fusion".
Data are transferred from one node to another via very fast interconnects.
Essential to 10g RAC is a "Shared Cache" technology.

Automatic Workload Repository (AWR) plays a role also. The Fast Application Notification (FAN) mechanism
that is part of RAC publishes events, describing the current service level being provided by each instance,
to AWR. The load balancing advisory information is then used to determine the best instance to serve
a new request.

 . With RAC, ALL instances on ALL nodes in a cluster access a SINGLE database.
 . But every instance has its own UNDO tablespace and REDO logs.

The Oracle Clusterware comprises several background processes that facilitate cluster operations.
The Cluster Synchronization Service (CSS), Event Management (EVM), and Oracle Cluster components
communicate with the corresponding cluster component layers in the other instances within the same
cluster database environment.

Questions per implementation arise on the following points:

 . Storage
 . Computer Systems/Storage-Interconnect
 . Database
 . Application Server
 . Public and Private networks
 . Application Control & Display

On the Storage level, it can be said that 10g RAC supports

- Automatic Storage Management (ASM)
- Oracle Cluster File System (OCFS)
- Network File System (NFS): limited support (in practice only theoretical)
- Disk raw partitions
- Third party cluster file systems

For application control and tools, it can be said that 10g RAC supports

- OEM Grid Control       http://hostname:5500/em
  OEM Database Control   http://hostname:1158/em
- "srvctl" is a command line interface to manage the cluster configuration,
  for example, starting and stopping all nodes in one command.
- Cluster Verification Utility (cluvfy) can be used for an installation and sanity check.

Failure in Client connections:
Depending on the Net configuration, type of connection, type of transaction etc..,
Oracle Net services provides a feature called "Transparent Application Failover" (TAF),
which can fail over a client session to another backup connection.

About HA and DR:

- RAC is HA, High Availability, that will keep things Up and Running in one site.
- Data Guard is DR, Disaster Recovery, and is able to mirror one site to another remote site.
7.2 Prepare your nodes:
-----------------------

7.2.1 Sketch of a 2-node Linux cluster

                 192.168.2.0
      ------------------------------------------ public network
          |                              |
          |                              |
     ------------                  -------------
     |InstanceA |Private network   |InstanceB  |
     |          |Ethernet          |           |
     |          |------------------|           |
     |          |192.168.1.0       |           |
     |          |                  |           |
     |          |____________      |           |
     |          |  -----   -|---   |           |
     |          |--|PWR|   |PWR|---|           |
     |          |  -----   -----   |           |
     |          |    |_____________|           |
     |          |                  |           |
     ------------                  -------------
          |  SCSI bus or Fibre Channel   |
          ------------------  ------------  Interconnect
                           |  |
                           |  |                     Fig 7.1
                       -----------
                       |Shared   |  - has Single DB on ASM or OCFS or RAW
                       |Disk     |  - has OCR and Voting disk on OCFS or RAW
                       |Storage  |
                       -----------

7.2.2 Storage Options

 Storage                        Oracle Clusterware   Database   Recovery area
 ----------------------------   ------------------   --------   -------------
 Automatic Storage Management   No                   Yes        Yes
 Cluster file system (OCFS)     Yes                  Yes        Yes
 Shared raw storage             Yes                  Yes        No

In the following, we will do an example installation on 3 nodes.

7.2.3 Install Redhat on all nodes with all options.

7.2.4 Create the oracle user and the groups dba and oinstall on all nodes.
      Make sure they have the same UID and GID on every node.

7.2.5 Make sure the user oracle has an appropriate .profile or .bash_profile

7.2.6 Every node needs a private network connection and a public network connection
      (at least two network cards).

7.2.7 Linux kernel parameters:

Most out of the box kernel parameters (of RHEL 3, 4, 5) are set correctly for Oracle, except a few.
You should have the following minimal configuration:

  net.ipv4.ip_local_port_range  1024 65000
  kernel.sem                    250 32000 100 128
  kernel.shmmni                 4096
  kernel.shmall                 2097152
  kernel.shmmax                 2147483648
  fs.file-max                   65536

You can check the most important parameters using the following command:

# /sbin/sysctl -a | egrep 'sem|shm|file-max|ip_local'

  net.ipv4.ip_local_port_range = 1024 65000
  kernel.sem = 250 32000 100 128
  kernel.shmmni = 4096
  kernel.shmall = 2097152
  kernel.shmmax = 2147483648
  fs.file-max = 65536

If some value should be changed, you can edit the "/etc/sysctl.conf" file and run the "/sbin/sysctl -p" command
to apply the new value immediately.
Every time the system boots, the init program runs the /etc/rc.d/rc.sysinit script. This script contains
a command to execute sysctl using /etc/sysctl.conf to dictate the values passed to the kernel.
Any values added to /etc/sysctl.conf will take effect each time the system boots.

7.2.8 Make sure ssh and scp are working on all nodes without asking for a password.
      Use ssh-keygen to arrange that.

7.2.9 Example "/etc/hosts" on the nodes:

Suppose you have the following 3 hosts, with their associated public and private names:

  public   private
  oc1      poc1
  oc2      poc2
  oc3      poc3

Then this could be a valid hosts file on the nodes:

  127.0.0.1       localhost.localdomain localhost
  192.168.2.99    rhes30
  192.168.2.166   oltp
  192.168.2.167   mw
  192.168.2.101   oc1    #public1
  192.168.1.101   poc1   #private1
  192.168.2.19    voc1   #virtual1
  192.168.2.102   oc2    #public2
  192.168.1.102   poc2   #private2
  192.168.2.177   voc2   #virtual2
  192.168.2.103   oc3    #public3
  192.168.1.103   poc3   #private3
  192.168.2.178   voc3   #virtual3

7.2.10 Example disk devices

On all nodes, the shared disk devices should be accessible through the same device names.
  Raw Device Name   Physical Device Name   Purpose
  ---------------   --------------------   ---------------------------
  /dev/raw/raw1     /dev/sda1              ASM Disk 1: +DATA1
  /dev/raw/raw2     /dev/sdb1              ASM Disk 1: +DATA1
  /dev/raw/raw3     /dev/sdc1              ASM Disk 2: +RECOV1
  /dev/raw/raw4     /dev/sdd1              ASM Disk 2: +RECOV1
  /dev/raw/raw5     /dev/sde1              OCR Disk (on RAW device)
  /dev/raw/raw6     /dev/sdf1              Voting Disk (on RAW device)


7.3 CRS installation:
---------------------

7.3.1 First install CRS in its own home directory

Install CRS in its own home directory, e.g. CRS10gHome, separate from the Oracle home dir.

As Oracle user:

./runInstaller

---------------------------------------------------
|                                                 |   Screen 1
|Specify File Locations                           |
|                                                 |
|Source                                           |
|Path: /install/crs10g/Disk1/stage/products.xml   |
|                                                 |
|Destination                                      |
|Name: CRS10gHome                                 |
|Path: /u01/app/oracle/product/10.1.0/CRS10gHome  |
|                                                 |
---------------------------------------------------

---------------------------------------------------
|                                                 |   Screen 2
|Cluster Configuration                            |
|                                                 |
|Cluster Name: lec1                               |
|                                                 |
|  Public Node Name        Private Node Name      |
| ---------------------------------------------   |
| |oc1                   | poc1               |   |
| |--------------------------------------------   |
| |oc2                   | poc2               |   |
| |--------------------------------------------   |
| |oc3                   | poc3               |   |
| |--------------------------------------------   |
---------------------------------------------------

In the next screen, you specify which of your networks is to be used as the public interface
(to connect to the public network) and which will be used for the private interconnect
to support cache fusion and the cluster heartbeat.
---------------------------------------------------
|                                                 |   Screen 3
|Private Interconnect Enforcement                 |
|                                                 |
|                                                 |
|  Interface Name   Subnet        Interface type  |
| ---------------------------------------------   |
| |eth0           |192.168.2.0  |Public       |   |
| |--------------------------------------------   |
| |eth1           |192.168.1.0  |Private      |   |
| |--------------------------------------------   |
|                                                 |
---------------------------------------------------

In the next screen, you specify /dev/raw/raw5 as the raw disk for the Oracle Cluster Registry.

---------------------------------------------------
|                                                 |   Screen 4
|Oracle Cluster Registry                          |
|                                                 |
|Specify OCR Location: /dev/raw/raw5              |
|                                                 |
---------------------------------------------------

In a similar fashion you specify the location of the Voting Disk.

---------------------------------------------------
|                                                 |   Screen 5
|Voting Disk                                      |
|                                                 |
|Specify Voting Disk: /dev/raw/raw6               |
|                                                 |
---------------------------------------------------

You now have to execute the /u01/app/oracle/orainventory/orainstRoot.sh script
on all Cluster Nodes as the root user.

After this, you can continue with the other window, and see an "Install Summary" screen.
Now you click "Install" and the installation begins. Besides the node you work on,
the software will also be copied to the other nodes.

After the installation is complete, you are once again prompted to run a script as root on each node
of the Cluster. This is the script "/u01/app/oracle/product/10.1.0/CRS10gHome/root.sh".

-- The olsnodes command.

After finishing the CRS installation, you can verify that the installation completed successfully
by running the following command on any node:

# cd /u01/app/oracle/product/10.1.0/CRS10gHome/bin
# olsnodes -n
oc1 1
oc2 2
oc3 3


7.4 Database software installation:
-----------------------------------

You can install the database software into the same directory on each node.
With OCFS2, you might do one install in a common shared directory for all nodes.
Because CRS is already running, the OUI detects that, and because it is cluster aware, it offers you
the option to install a clustered implementation.

You start the installation by running ./runInstaller as the oracle user on one node.
For the most part, it looks the same as a single-instance installation.

After the file location screen (source and destination), you will see this screen:

---------------------------------------------------
|                                                 |
|Specify Hardware Cluster Installation Mode       |
|                                                 |
| o Cluster installation mode                     |
|                                                 |
|     Node name                                   |
|   --------------------------------------------- |
|   | [] oc1                                    | |
|   | [] oc2                                    | |
|   | [] oc3                                    | |
|   --------------------------------------------- |
|                                                 |
| o Local installation (non cluster)              |
|                                                 |
|-------------------------------------------------|

Most of the time, you will do a "software only" installation, and create the database later with the DBCA.

For the first node only, after some time, the Virtual IP Configuration Assistant, VIPCA, will start.
Here you can configure the Virtual IP addresses you will use for application failover
and the Enterprise Manager Agent. Here you will select the Virtual IPs for all nodes.
VIPCA only needs to run once per Cluster.


7.5 Creating the RAC database with DBCA:
----------------------------------------

Launching the DBCA for installing a RAC database is much the same as launching DBCA for a single instance.
If DBCA detects cluster software installed, it gives you the option to install a RAC database
or a single instance.
As oracle user:

% dbca &

---------------------------------------------------
|                                                 |
|Welcome to the database configuration assistant  |
|                                                 |
|                                                 |
|                                                 |
| o Oracle Real Application Cluster database      |
|                                                 |
| o Oracle single instance database               |
|                                                 |
|-------------------------------------------------|

After selecting RAC, the next screen gives you the option to select nodes:

---------------------------------------------------
|                                                 |
|Select the nodes on which you want to create     |
|the cluster database. The local node oc1 will    |
|always be used whether or not it is selected.    |
|                                                 |
|     Node name                                   |
|   --------------------------------------------- |
|   | [] oc1                                    | |
|   | [] oc2                                    | |
|   | [] oc3                                    | |
|   --------------------------------------------- |
|                                                 |
|                                                 |
|-------------------------------------------------|

In the next screens, you can choose the type of database (oltp, dw etc..), and all other items,
just like a single instance install.
At a certain point, you can choose to use ASM diskgroups, flash-recovery area etc..


7.5 Example tnsnames.ora and listener.ora:
------------------------------------------


7.6 RAC utilities:
------------------

Some examples will illustrate the use of some important utilities.

Example 1: removing and adding a failed node
--------------------------------------------

Suppose, using the above example, that instance rac3 on node oc3 fails.
Suppose that you need to repair the node (e.g. a hard disk crash).

-- Remove the instance:

% srvctl remove instance -d rac -i rac3
Remove instance rac3 for the database rac (y/n)? y

-- Remove the node from the cluster:

# cd /u01/app/oracle/product/10.1.0/CRS10gHome/bin
# ./olsnodes -n
oc1 1
oc2 2
oc3 3

# cd ../install
# ./rootdeletenode.sh oc3,3

# cd ../bin
# ./olsnodes -n
oc1 1
oc2 2
#

Suppose that you have repaired host oc3. We now want to add it back into the cluster.
Host oc3 has the OS newly installed, and its /etc/hosts file is just like it is on the other nodes.
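The before/after "olsnodes -n" comparison used in the removal step can also be scripted.
The following is only a minimal sketch in Python, assuming the two-column "name number" output
shown above; the function name parse_olsnodes is hypothetical and not part of any Oracle tool.

```python
def parse_olsnodes(output):
    """Return {node_name: node_number} from `olsnodes -n` style text.

    Assumes each relevant line is "<name> <number>", as in the listing above.
    Lines that do not match this shape (prompts, blanks) are ignored.
    """
    nodes = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            nodes[parts[0]] = int(parts[1])
    return nodes


# Sample text copied from the example above (before and after rootdeletenode.sh).
before = parse_olsnodes("oc1 1\noc2 2\noc3 3\n")
after = parse_olsnodes("oc1 1\noc2 2\n#\n")

# Nodes that were present before but are gone afterwards.
removed = sorted(set(before) - set(after))
print(removed)  # ['oc3']
```

In practice the text would come from running the command on a cluster node; here it is pasted in
so that the sketch stays self-contained.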
-- Add the node at the clusterware layer:

From oc1 or oc2, go to the $CRS_Home/oui/bin directory, and run

# ./addNode.sh

A graphical screen pops up, and you are able to add oc3 to the cluster. All CRS files are copied
to the new node. To start the services on the new node, you are then prompted to run "rootaddnode.sh"
on the active node and "root.sh" on the new node.

# ./rootaddnode.sh

# ssh oc3
# cd /u01/app/oracle/product/10.1.0/CRS10gHome
# ./root.sh

-- Install the Oracle software on the new node:


Example 2: showing all nodes from a node
----------------------------------------

# lsnodes -v

# cd /u01/app/oracle/product/10.1.0/CRS10gHome/bin
# ./olsnodes -n
oc1 1
oc2 2
oc3 3


Example 3: using srvctl
-----------------------

The Server Control (SRVCTL) utility is installed on each node by default. You can use SRVCTL to start
and stop the database and instances, manage configuration information, and to move or remove instances
and services. Some SRVCTL operations store configuration information in the OCR. SRVCTL performs other
operations, such as starting and stopping instances, by sending requests to the Oracle Clusterware
process CRSD, which then starts or stops the Oracle Clusterware resources.

srvctl must be run from the $ORACLE_HOME of the RAC you are administering.
The basic format of a srvctl command is

  srvctl <command> <target> [options]

where command is one of

  enable|disable|start|stop|relocate|status|add|remove|modify|getenv|setenv|unsetenv|config

and the target, or object, can be a database, instance, service, ASM instance, or the nodeapps.

-- Example 1: To view help:

% srvctl -h
% srvctl command -h

-- Example 2: To see the SRVCTL version number, enter

% srvctl -V

-- Example 3. Bring up the MYSID1 instance of the MYSID database.

% srvctl start instance -d MYSID -i MYSID1

-- Example 4. Stop the MYSID database: all its instances and all its services, on all nodes.

% srvctl stop database -d MYSID

-- Example 5. Stop the nodeapps on the myserver node.
   NB: Instances and services also stop.

% srvctl stop nodeapps -n myserver

-- Example 6. Add the MYSID3 instance, which runs on the myserver node, to the MYSID clustered database.

% srvctl add instance -d MYSID -i MYSID3 -n myserver

-- Example 7. Add a new node, the mynewserver node, to a cluster.

% srvctl add nodeapps -n mynewserver -o $ORACLE_HOME -A 149.181.201.1/255.255.255.0/eth1

(The -A flag precedes an address specification.)

-- Example 8. To change the VIP (virtual IP) on a RAC node, use the command

% srvctl modify nodeapps -A new_address

-- Example 9. Find out whether the nodeapps on mynewserver are up.

% srvctl status nodeapps -n mynewserver
VIP is running on node: mynewserver
GSD is running on node: mynewserver
Listener is not running on node: mynewserver
ONS daemon is running on node: mynewserver

-- Example 10. The following command and output show the expected configuration
   for a three node database called ORCL.

% srvctl config database -d ORCL
server01 ORCL1 /u01/app/oracle/product/10.1.0/db_1
server02 ORCL2 /u01/app/oracle/product/10.1.0/db_1
server03 ORCL3 /u01/app/oracle/product/10.1.0/db_1

-- Example 11. Disable the ASM instance on myserver for maintenance.

% srvctl disable asm -n myserver

-- Example 12. Debugging srvctl

To debug srvctl in 10g, simply set the SRVM_TRACE environment variable.

% export SRVM_TRACE=true

-- Example 13. Question (version 10g RAC)

Q: How do you add a listener to the nodeapps using the srvctl command, and can it be added
   with srvctl at all?

A: Just edit listener.ora on all concerned nodes and add the entries (the usual way).
   srvctl will automatically make use of it. For example

% srvctl start database -d SAMPLE

will start database SAMPLE and its associated listener LSNR_SAMPLE.

-- Example 14. Adding services.
% srvctl add database -d ORCL -o /u01/app/oracle/product/10.1.0/db_1
% srvctl add instance -d ORCL -i ORCL1 -n server01
% srvctl add instance -d ORCL -i ORCL2 -n server02
% srvctl add instance -d ORCL -i ORCL3 -n server03

-- More examples

% srvctl remove instance -d rac -i rac3
% srvctl disable instance -d orcl -i orcl2
% srvctl enable instance -d orcl -i orcl2


#############################################################################################
#############################################################################################
#############################################################################################

==============================================
Sections 8 and 9: CLUSTERS ON AIX:
==============================================

Section 8: GPFS

========================================
8.1. General Parallel File System (GPFS):
========================================

Only AIX and Linux (pSeries) related.

General Parallel File System (GPFS) is a high performance "shared-disk file system" that can provide
data access from nodes in a cluster environment. Parallel and serial applications can readily access
shared files using standard UNIX® file system interfaces, and the same file can be accessed concurrently
from multiple nodes. GPFS is designed to provide high availability through logging and replication,
and can be configured for failover from both disk and server malfunctions.

GPFS often operates within the context of a HACMP cluster, but you can build just GPFS "clusters" as well.


8.2. Creating a 2 node GPFS Cluster:
====================================

Suppose we have two nodes named node2 and node3. Our goal is to create a single GPFS filesystem,
named "/my_gpfs", consisting of 2 disks used for data and metadata.
These disks are housed by two DS4300 storage subsystems.
A tiebreaker disk, in a separate DS4100, will be used to maintain node quorum during single node failures.
Additionally, a "filesystem descriptor" disk for /my_gpfs is located at the same site.

Servers: 2 Nodes = 2 x lpar; per lpar 1 cpu, 2GB RAM, 2 x FC adapter, 2 x Ethernet adapter
Storage: 2 x DS4300 for GPFS and data, 1 x DS4100 for tiebreaker disk

Suppose further that the nodes use the following IP addresses:

  Node2: 10.1.1.32
  Node3: 10.1.1.33

The Ethernet adapters per Server are aggregated, or configured in NIB (backup standby mode).

Note : What are Tiebreaker disks?

GPFS can use two types of quorum mechanisms in order to determine service availability:

- Disk quorum
- Node quorum

In case availability of either of these resources is less than or equal to 50%, GPFS file system services
are automatically stopped. When node quorum is not met, GPFS stops its cluster-wide services and access
to all filesystems within the cluster is no longer possible.
If 50% or more of the disks serving a GPFS file system fail, disk quorum (that is, the number of
available "filesystem descriptors" for that particular file system) is no longer met and the
filesystem will be unmounted.

To eliminate the need for a tiebreaker node, a new node quorum mechanism for two node clusters
was introduced as of GPFS 2.3. It is called a tiebreaker disk. If one of the two nodes goes down,
we still have "enough" node quorum to keep the GPFS system running.
Basically, a tiebreaker disk replaces a "tiebreaker node".

-- Preparations:
-- -------------

1. The systems have AIX >= 5.3ML2 installed, and gpfs.base.xxxx installed
2. Make sure name resolution is ok, either by DNS or by /etc/hosts
3. Sync the system clocks, for example by NTP
4. Make sure rcp, ssh, scp are working (via .rhosts etc.. or ssh protocols)
5. A distributed shell (DSH) is installed on each node.
6. During cluster setup some configuration files may be created and used with GPFS commands.
   These files reside in a user created directory called /var/mmfs/conf.
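The quorum rule from the tiebreaker note above can be modelled roughly as follows. This is only
a simplified sketch in Python of the behaviour as described in this text (more than 50% must be
available, or one surviving node plus a reachable tiebreaker disk); it is not GPFS's actual
algorithm, and the function and parameter names are hypothetical.

```python
def node_quorum_met(nodes_up, total_quorum_nodes, tiebreaker_disk_ok=False):
    """Simplified model of the node quorum rule described in the note.

    - Without a tiebreaker disk: more than 50% of the quorum nodes must be up.
    - With a tiebreaker disk (assumption: it is reachable from a surviving
      node): a single surviving node is treated as "enough" quorum.
    """
    if nodes_up <= 0:
        return False
    if tiebreaker_disk_ok:
        return True
    return 2 * nodes_up > total_quorum_nodes


# Two node cluster, one node down, no tiebreaker: exactly 50%, so no quorum.
print(node_quorum_met(1, 2))                           # False
# Same situation with a reachable tiebreaker disk: services keep running.
print(node_quorum_met(1, 2, tiebreaker_disk_ok=True))  # True
```

The second call shows why the example cluster uses a tiebreaker disk: without it, losing either
of the two nodes would stop GPFS on the remaining node as well.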
-- Creating the GPFS cluster:
-- ---------------------------

The first step is to create a GPFS cluster named TbrCl using the command:

# mmcrcluster -n /var/mmfs/conf/nodefile -p node2 -s node3 -C TbrCl -A

A file called "nodefile" contains the cluster node information, describing the function of each node:

  # Node2 can be a file system manager and is relevant for GPFS quorum
  node2:manager-quorum
  # Node3 can be a file system manager and is relevant for GPFS quorum
  node3:manager-quorum

Each node can fulfill the function of a file system manager and is relevant for maintaining node quorum.
A GPFS cluster designates a primary cluster manager (node2) and appoints a backup (node3) in case the
primary fails. Cluster services will be started automatically during node boot (-A).

After successfully creating the cluster, you can verify your setup:

# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         TbrCl.node2
  GPFS cluster id:           720858653441148399
  GPFS UID domain:           TbrCl.node2
  Remote shell command:      /usr/bin/rsh
  Remote file copy command:  /usr/bin/rcp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    node2
  Secondary server:  node3

 Node number  Node name  IP address  Full node name  Remarks
-------------------------------------------------------------
      1       node2      10.1.1.32   node2           quorum node
      2       node3      10.1.1.33   node3           quorum node

The GPFS daemon has to be started on all nodes:

# mmstartup -a

With GPFS you can administer the whole cluster from any cluster node.
After starting GPFS services you should examine the state of the cluster:

# mmgetstate -aL

 Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state
-------------------------------------------------------------
      1       node2        2        2          2        active
      2       node3        2        2          2        active

At this point, the cluster software is running, but you haven't done anything yet on the filesystems.
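Checking that every node reports "active" can be automated by parsing the mmgetstate listing.
The sketch below is a minimal Python example that assumes the six-column layout shown above;
the names NodeState and parse_mmgetstate are hypothetical helpers, not GPFS tools.

```python
from typing import NamedTuple


class NodeState(NamedTuple):
    number: int
    name: str
    quorum: str       # kept as text, since tiebreaker mode shows e.g. "1*"
    nodes_up: int
    total_nodes: int
    state: str


def parse_mmgetstate(output):
    """Parse `mmgetstate -aL` style text into a list of NodeState rows.

    Assumes data rows start with the numeric node number; header and
    separator lines are skipped because they do not.
    """
    rows = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 6 and parts[0].isdigit():
            rows.append(NodeState(int(parts[0]), parts[1], parts[2],
                                  int(parts[3]), int(parts[4]), parts[5]))
    return rows


# Sample listing copied from the example above.
SAMPLE = """\
 Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state
-------------------------------------------------------------
      1       node2        2        2          2        active
      2       node3        2        2          2        active
"""

states = parse_mmgetstate(SAMPLE)
all_active = all(s.state == "active" for s in states)
print(all_active)  # True
```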
-- Configuring GPFS disks
-- ----------------------

Before starting with the configuration of GPFS disks, you have to make sure that each cluster node
has access to each SAN attached disk when running in a shared disk environment.

With AIX 5L, you can use the lspv command to verify your disks (hdisk) are properly configured:

# lspv
hdisk2   none   none
hdisk3   none   none
hdisk4   none   none
hdisk5   none   none

If you look for LUN related information (e.g. volume names), issue the following command against
a dedicated hdisk:

# lsattr -El hdisk2
..
.... (in the output, you will also see SAN stuff)
..

It is very important to keep a well balanced disk configuration when using GPFS, because this ensures
optimal performance by distributing I/O requests evenly among storage subsystems and attached data disks.
Keep in mind that all GPFS disks belonging to a particular file system should be of the same size.

GPFS uses a mechanism called Network Shared Disk (NSD) to provide file system access to cluster nodes
which do not have direct physical access to file system disks. A diskless node accesses an NSD via the
cluster network, and I/O operations are handled as if they run against a directly attached disk from
an operating system's perspective. A special device driver handles data shipping using the cluster network.

NSDs can also be used in a purely SAN based GPFS configuration where each node can directly access any disk.
In case a node loses direct disk access, it automatically switches to NSD-mode, sending I/O requests
via the network to other nodes with direct disk attachment. This mechanism increases file system
availability, and should normally be used.

When using NSD, a primary and a backup server are assigned to each NSD. In case a node loses its direct
disk attachment, it contacts the primary NSD server, or the backup server in case the primary is not available.

To establish NSDs, you need to create "descriptor files" that describe the function of each disk.
In our example, we will use the following file:

/var/mmfs/conf/diskfile

  #Description of disk attributes
  #<disk name>:<primary NSD server>:<2ndary NSD server>:<disk usage>:<failure group>:
  #Data and metadata disk for /my_gpfs, site A, DS4300_1
  hdisk2:node2:node3:dataAndMetadata:1:
  #Data and metadata disk for /my_gpfs, site B, DS4300_2
  hdisk3:node3:node2:dataAndMetadata:2:
  #File system descriptor disk for /my_gpfs, site C, DS4100
  hdisk4:::descOnly:3:
  #Tiebreaker disk, site C, DS4100
  hdisk5:::descOnly:-1:

Here, our cluster uses 4 disks with GPFS. Filesystem "/my_gpfs" uses hdisk2 and hdisk3 for data and metadata.
Therefore these disks will use the NSD mechanism to provide file system data access in case direct disk
access fails on one of the cluster nodes. Node2 is the primary NSD server for hdisk2 with node3 being
its backup. The same is true for hdisk3, but then the other way around.
Each of these disks belongs to a different "failure group" (1=site A, 2=site B), which basically enables
replication of file system data and metadata between the two sites.

After successfully creating the "disk descriptor file", the following command is used to define the NSDs:

# mmcrnsd -F /var/mmfs/conf/diskfile -v yes

GPFS assigns a Physical Volume ID (PVID) to each of the disks. This information is written to sector 2
on the AIX5L hdisk. Since GPFS uses its own PVIDs, do not confuse them with AIX5L PVIDs.

After a successful creation of the NSDs, you can verify your setup using the mmlsnsd command:

# mmlsnsd -aL

 File system   Disk name   NSD Volume ID      Primary node         Backup node
-------------------------------------------------------------------------------
 (free disk)   gpfs1nsd    099CAF2043A04625   node2                node3
 (free disk)   gpfs2nsd    099CAF2043A04627   node3                node2
 (free disk)   gpfs3nsd    099CAF2043A04628   (directly attached)
 (free disk)   gpfs4nsd    099CAF2043A04629   (directly attached)

During NSD creation, the diskfile was rewritten. Each hdisk stanza is commented out,
and an equivalent NSD stanza is inserted.
  #<disk name>:<primary NSD server>:<2ndary NSD server>:<disk usage>:<failure group>:
  #Data and metadata disk for /my_gpfs, site A, DS4300_1
  #hdisk2:node2:node3:dataAndMetadata:1:
  gpfs1nsd:::dataAndMetadata:1
  #Data and metadata disk for /my_gpfs, site B, DS4300_2
  #hdisk3:node3:node2:dataAndMetadata:2:
  gpfs2nsd:::dataAndMetadata:2
  #File system descriptor disk for /my_gpfs, site C, DS4100
  #hdisk4:::descOnly:3:
  gpfs3nsd:::descOnly:3
  #Tiebreaker disk, site C, DS4100
  #hdisk5:::descOnly:-1:
  gpfs4nsd:::descOnly:-1

-- Activating tiebreaker mode
-- --------------------------

When using a two node cluster with tiebreaker disks, the cluster configuration must be switched
to tiebreaker mode. Of course you need to know which disks are being used as tiebreaker disks.
Up to 3 disks are allowed. In our example, gpfs4nsd (that is hdisk5) is the only tiebreaker disk.

With the following command sequence, tiebreaker mode is turned on:

# mmshutdown -a
# mmchconfig tiebreakerDisks="gpfs4nsd"
# mmstartup -a

A 2 node cluster running in tiebreaker mode can easily be identified by running the following command:

# mmgetstate -aL

 Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state
---------------------------------------------------------------
      1       node2        1*       2          2        active
      2       node3        1*       2          2        active

If the quorum information is displayed as "1*", this is a 2 node tiebreaker disk cluster.

Another nice command to check the status of the cluster is "mmlsconfig".

# mmlsconfig

Configuration data for cluster TbrCl.node2:
-------------------------------------------
ClusterName TbrCl.node2
ClusterId 8262362723390
ClusterType 1c
Multinode yes
autoload yes
useDiskLease yes
MaxFeatureLevelAllowed 809
tiebreakerDisks gpfs4nsd

-- Creating a GPFS Filesystem
-- --------------------------

GPFS generally maintains at least 3 filesystem descriptors per filesystem; these form the disk quorum.
Ideally, the descriptors are distributed over as many disks as possible.
But you might have only 2 disks, resulting in 2 copies on one disk, and 1 copy on the other disk.
That would be an unbalanced situation.
GPFS always verifies if more than 50% of the filesystem descriptors are available, and if not,
it will unmount the filesystem.

Before we can create the /my_gpfs filesystem, we need to prepare a file named "fdisk_mygpfs"
describing all disks belonging to the filesystem.
In our example, we use only 2 disks for the filesystem, but we like to have a balanced situation
with at least 3 descriptor areas. For this, we can use the disk that was defined by the
"#hdisk4:::descOnly:3:" entry in the "nsd diskfile" shown before.

Our "fdisk_mygpfs" looks like this:

  #<disk name>:<primary NSD server>:<2ndary NSD server>:<disk usage>:<failure group>:
  #Data and metadata disk for /my_gpfs, site A, DS4300_1
  gpfs1nsd:::dataAndMetadata:1
  #Data and metadata disk for /my_gpfs, site B, DS4300_2
  gpfs2nsd:::dataAndMetadata:2
  #File system descriptor disk for /my_gpfs, site C, DS4100
  gpfs3nsd:::descOnly:3

The next step is to create the file system:

# mmcrfs /my_gpfs /dev/my_gpfs -F /var/mmfs/conf/fdisk_mygpfs -A yes -m2 -M2 -r2 -R2 -v yes

The mountpoint is /my_gpfs and a device called /dev/my_gpfs is created. The option -F is used to specify
a configuration file describing the filesystem's NSDs. We want this filesystem to be mounted
automatically during startup (-A yes). When designing our cluster, we decided to use data and
metadata replication (-r2, -m2) to provide high availability.

If you intend to create several filesystems within your cluster, repeat all the steps as shown above.

-- Mounting a GPFS Filesystem
-- --------------------------

Filesystem "/my_gpfs" will be mounted on each of the cluster nodes using the command:

# dsh -a mount -t mmfs

The command dsh is the Distributed Shell, which should be available on your AIX53 systems.

Your GPFS filesystem is also registered in /etc/filesystems.
Also, standard AIX commands can be used against the GPFS filesystems, for example:

# dsh -w node2,node3 df -k /my_gpfs

Filesystem /my_gpfs is now available to both nodes, with all three file system descriptors
being well balanced across failure groups and disks.
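The "balanced descriptor" requirement discussed above can be checked against a descriptor file
before running mmcrfs. The following Python sketch assumes the colon separated layout of the
"fdisk_mygpfs" example (name, primary server, backup server, usage, failure group); the helper
names are hypothetical and the check is a simplification of what GPFS itself does.

```python
from typing import NamedTuple


class DiskDescriptor(NamedTuple):
    name: str
    primary: str
    backup: str
    usage: str
    failure_group: int


def parse_descriptors(text):
    """Parse descriptor lines, skipping comment lines (#) and blank lines."""
    disks = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(":")
        if len(parts) < 5:
            continue  # not a complete descriptor line
        disks.append(DiskDescriptor(parts[0], parts[1], parts[2],
                                    parts[3], int(parts[4])))
    return disks


def failure_groups_balanced(disks, wanted=3):
    """True if the disks span at least `wanted` distinct failure groups,
    so that each descriptor copy can live in its own group (simplified)."""
    return len({d.failure_group for d in disks}) >= wanted


# Content copied from the fdisk_mygpfs example above.
SAMPLE = """\
#Data and metadata disk for /my_gpfs, site A, DS4300_1
gpfs1nsd:::dataAndMetadata:1
#Data and metadata disk for /my_gpfs, site B, DS4300_2
gpfs2nsd:::dataAndMetadata:2
#File system descriptor disk for /my_gpfs, site C, DS4100
gpfs3nsd:::descOnly:3
"""

disks = parse_descriptors(SAMPLE)
print(failure_groups_balanced(disks))      # True
print(failure_groups_balanced(disks[:2]))  # False
```

The second call illustrates the unbalanced case: with only the two data disks, there are just
two failure groups available for three descriptor copies.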
# mmlsdisk my_gpfs

disk         driver   sector failure holds    holds
name         type       size   group metadata data  status        availability disk id remarks
-----------------------------------------------------------------------------------------------------
gpfs1nsd     nsd         512       1 yes      yes   ready         up                 1 desc
gpfs2nsd     nsd         512       2 yes      yes   ready         up                 2 desc
gpfs3nsd     nsd         512       3 no       no    ready         up                 3 desc

root@zd111l13.nl.eu.abnamro.com:/data/documentum/dmadmin#mmlsdisk /dev/gpfsfs0

disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
gpfs3nsd     nsd         512       1 yes      yes   ready         up           system
gpfs4nsd     nsd         512       2 yes      yes   ready         up           system

Notes:
------

Note 1: SDD driver

Subsystem Device Driver, SDD, is a pseudo driver designed to support the multipath configuration
environments in the IBM TotalStorage Enterprise Storage Server, the IBM TotalStorage DS family,
and the IBM System Storage SAN Volume Controller.
You can see this driver installed, for example, in HACMP and GPFS systems.

At this time, SDD version 1.6.1.0 is not supported by VIOS. Of course, this might change later.
Note 2: pv listing: In a gpfs cluster, a lspv might show output like the following example: root@zd110l13:/root# lspv hdisk0 00cb61fe0b562af0 rootvg active hdisk1 00cb61fe0fb40619 rootvg active hdisk2 00cb61fe33429fa6 vge0corddap01 active hdisk3 00cb61fe3342a096 vge0corddap01 active hdisk4 00cb61fe3342a175 gpfs3nsd hdisk5 00cb61fe33536125 gpfs4nsd root@zd110l13:/root# mmlsnsd -aL File system Disk name NSD volume ID Primary node Backup node --------------------------------------------------------------------------------------------- gpfsfs0 gpfs3nsd 0A208FB64650A409 zd110l13 zd110l14.nl.eu.abnamro.com gpfsfs0 gpfs4nsd 0A208FB64650A40D zd110l13 zd110l14.nl.eu.abnamro.com 8.3 GPFS commands: =================== 8.3.1 The mmcrcluster Command: -------------------------------- Name mmcrcluster - Creates a GPFS cluster from a set of nodes. Synopsis mmcrcluster -n NodeFile -p PrimaryServer [-s SecondaryServer] [-r RemoteShellCommand] [-R RemoteFileCopyCommand] [-C ClusterName] [-U DomainName] [-A] [-c ConfigFile] Description Use the mmcrcluster command to create a GPFS cluster. Upon successful completion of the mmcrcluster command, the /var/mmfs/gen/mmsdrfs and the /var/mmfs/gen/mmfsNodeData files are created on each of the nodes in the cluster. Do not delete these files under any circumstances. For further information, see the General Parallel File System: Concepts, Planning, and Installation Guide. You must follow these rules when creating your GPFS cluster: While a node may mount file systems from multiple clusters, the node itself may only be added to a single cluster using the mmcrcluster or mmaddnode command. The nodes must be available for the command to be successful. If any of the nodes listed are not available when the command is issued, a message listing those nodes is displayed. You must correct the problem on each node and issue the mmaddnode command to add those nodes. You must designate at least one node as a quorum node. 
You are strongly advised to designate the cluster configuration servers as quorum nodes.
How many quorum nodes you will have altogether depends on whether you intend to use the
node quorum with tiebreaker algorithm or the regular node based quorum algorithm.
For more details, see the General Parallel File System: Concepts, Planning, and Installation Guide
and search for designating quorum nodes.

Parameters

-A
   Specifies that GPFS daemons are to be automatically started when nodes come up.
   The default is not to start daemons automatically.

-C ClusterName
   Specifies a name for the cluster. If the user-provided name contains dots,
   it is assumed to be a fully qualified domain name. Otherwise, to make the cluster name unique,
   the domain of the primary configuration server will be appended to the user-provided name.
   If the -C flag is omitted, the cluster name defaults to the name of the primary
   GPFS cluster configuration server.

-c ConfigFile
   Specifies a file containing GPFS configuration parameters with values different than
   the documented defaults. A sample file can be found in /usr/lpp/mmfs/samples/mmfs.cfg.sample.
   See the mmchconfig command for a detailed description of the different configuration parameters.
   The -c ConfigFile parameter should only be used by experienced administrators.
   Use this file only to set parameters that appear in the mmfs.cfg.sample file.
   Changes to any other values may be ignored by GPFS. When in doubt, use the mmchconfig command instead.

-n NodeFile
   NodeFile consists of a list of node descriptors, one per line, to be included in the GPFS cluster.
   Node descriptors are defined as:

      NodeName:NodeDesignations

   where:

   NodeName
      is the hostname or IP address to be used by GPFS for node to node communication.
      The hostname or IP address must refer to the communications adapter over which the
      GPFS daemons communicate. Alias interfaces are not allowed. Use the original address or
      a name that is resolved by the host command to that original address.
You may specify a node using any of these forms: Format Example Short hostname k145n01 Long hostname k145n01.kgn.ibm.com IP address 9.119.19.102 NodeDesignations is an optional, '-' separated list of node roles. manager | client - Indicates whether a node is part of the pool of nodes from which configuration and file system managers are selected. The default is client. quorum | nonquorum - Indicates whether a node is to be counted as a quorum node. The default is nonquorum. You must provide a descriptor for each node to be added to the GPFS cluster. -p PrimaryServer Specifies the primary GPFS cluster configuration server node used to store the GPFS configuration data. This node must be a member of the GPFS cluster. -R RemoteFileCopy Specifies the fully-qualified path name for the remote file copy program to be used by GPFS. The default value is /usr/bin/rcp. The remote copy command must adhere to the same syntax format as the rcp command, but may implement an alternate authentication mechanism. -r RemoteShellCommand Specifies the fully-qualified path name for the remote shell program to be used by GPFS. The default value is /usr/bin/rsh. The remote shell command must adhere to the same syntax format as the rsh command, but may implement an alternate authentication mechanism. -s SecondaryServer Specifies the secondary GPFS cluster configuration server node used to store the GPFS cluster data. This node must be a member of the GPFS cluster. It is suggested that you specify a secondary GPFS cluster configuration server to prevent the loss of configuration data in the event your primary GPFS cluster configuration server goes down. When the GPFS daemon starts up, at least one of the two GPFS cluster configuration servers must be accessible. If your primary GPFS cluster configuration server fails and you have not designated a secondary server, the GPFS cluster configuration files are inaccessible, and any GPFS administrative commands that are issued fail. 
File system mounts or daemon startups also fail if no GPFS cluster configuration server is available.

-U DomainName
   Specifies the UID domain name for the cluster.
   A detailed description of the GPFS user ID remapping convention is contained in
   UID Mapping for GPFS In a Multi-Cluster Environment at
   www.ibm.com/servers/eserver/clusters/library/wp_aix_lit.html.

Exit status
   0 Successful completion.
   1 A failure has occurred.

Security
   You must have root authority to run the mmcrcluster command.
   You may issue the mmcrcluster command from any node in the GPFS cluster.
   A properly configured .rhosts file must exist in the root user's home directory on each node
   in the GPFS cluster. If you have designated the use of a different remote communication program
   on either the mmcrcluster or the mmchcluster command, you must ensure:
   - Proper authorization is granted to all nodes in the GPFS cluster.
   - The nodes in the GPFS cluster can communicate without the use of a password,
     and without any extraneous messages.

Example 1:
----------

To create a GPFS cluster made of all of the nodes listed in the file /u/admin/nodelist,
using node k164n05 as the primary server, and node k164n04 as the secondary server, issue:

# mmcrcluster -n /u/admin/nodelist -p k164n05 -s k164n04

where /u/admin/nodelist has these contents:

k164n04.kgn.ibm.com:quorum
k164n05.kgn.ibm.com:quorum
k164n06.kgn.ibm.com

The output of the command is similar to:

Mon Aug 9 22:14:34 EDT 2004: 6027-1664 mmcrcluster: Processing node k164n04.kgn.ibm.com
Mon Aug 9 22:14:38 EDT 2004: 6027-1664 mmcrcluster: Processing node k164n05.kgn.ibm.com
Mon Aug 9 22:14:42 EDT 2004: 6027-1664 mmcrcluster: Processing node k164n06.kgn.ibm.com
mmcrcluster: Command successfully completed
mmcrcluster: 6027-1371 Propagating the changes to all affected nodes.
This is an asynchronous process.
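The NodeName:NodeDesignations format described for the -n parameter can be checked before mmcrcluster is run. The following Python sketch is not part of GPFS; the function names are hypothetical, and it only applies the defaults (client, nonquorum) and the "at least one quorum node" rule stated in the description above.

```python
def parse_node_descriptor(line):
    """Split 'NodeName:NodeDesignations' into (name, roles), applying the documented defaults."""
    name, _, designations = line.strip().partition(":")
    roles = {"manager": False, "quorum": False}  # defaults: client, nonquorum
    for token in filter(None, designations.split("-")):
        if token in ("manager", "client"):
            roles["manager"] = token == "manager"
        elif token in ("quorum", "nonquorum"):
            roles["quorum"] = token == "quorum"
        else:
            raise ValueError(f"unknown node designation: {token!r}")
    return name, roles


def parse_node_file(text):
    """Parse a whole node file and enforce that at least one quorum node is designated."""
    nodes = [parse_node_descriptor(line) for line in text.splitlines() if line.strip()]
    if not any(roles["quorum"] for _, roles in nodes):
        raise ValueError("at least one node must be designated as a quorum node")
    return nodes


# Contents of /u/admin/nodelist from Example 1.
NODELIST = """k164n04.kgn.ibm.com:quorum
k164n05.kgn.ibm.com:quorum
k164n06.kgn.ibm.com
"""

if __name__ == "__main__":
    for name, roles in parse_node_file(NODELIST):
        print(name, roles)
```

A check like this catches a misspelled designation (for example "quorom") before the node file is handed to mmcrcluster or mmaddnode.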
To confirm the creation, enter: # mmlscluster The system displays information similar to: GPFS cluster information ======================== GPFS cluster name: k164n05.kgn.ibm.com GPFS cluster id: 680681562214606028 GPFS UID domain: k164n05.kgn.ibm.com Remote shell command: /usr/bin/rsh Remote file copy command: /usr/bin/rcp GPFS cluster configuration servers: ------------------------------------- Primary server: k164n05.kgn.ibm.com Secondary server: k164n04.kgn.ibm.com Node number Node name IP address Full node name Remarks -------------------------------------------------------------------------- 1 k164n04 198.117.68.68 k164n04.kgn.ibm.com quorum node 2 k164n05 198.117.68.69 k164n05.kgn.ibm.com quorum node 3 k164n06 198.117.68.70 k164n06.kgn.ibm.com Example 2: ---------- # mmcrcluster -n /home/root/nodelist -p zcnodeb -s n5nodea -r /usr/bin/rsh -R /usr/bin/rcp -C MDLPR -A Where the -C option determines the clustername. You can start the cluster (GPFS daemon) by using # mmstartup -a Check if all nodes are registered in the cluster # mmlscluster 8.3.2 Other GPFS commands: --------------------------- The most common gpfs commands, will be illustrated by examples. 
-- List cluster info: mmlscluster
-- ------------------------------

# mmlscluster

The system displays information similar to:

GPFS cluster information
========================
  GPFS cluster name:         k164n05.kgn.ibm.com
  GPFS cluster id:           680681562214606028
  GPFS UID domain:           k164n05.kgn.ibm.com
  Remote shell command:      /usr/bin/rsh
  Remote file copy command:  /usr/bin/rcp

GPFS cluster configuration servers:
-------------------------------------
  Primary server:    k164n05.kgn.ibm.com
  Secondary server:  k164n04.kgn.ibm.com

 Node number  Node name  IP address     Full node name       Remarks
--------------------------------------------------------------------------
      1       k164n04    198.117.68.68  k164n04.kgn.ibm.com  quorum node
      2       k164n05    198.117.68.69  k164n05.kgn.ibm.com  quorum node
      3       k164n06    198.117.68.70  k164n06.kgn.ibm.com

-- Retrieving the Cluster status:
-- ------------------------------

# mmgetstate -aL

 Node number  Node name  Quorum  Nodes up  Total nodes  GPFS state
-------------------------------------------------------------
      1       node2        2        2          2        active
      2       node3        2        2          2        active

-- Retrieving config data of the Cluster:
-- --------------------------------------

# mmlsconfig

Configuration data for cluster TbrCl.node2:
-------------------------------------------
ClusterName TbrCl.node2
ClusterId 8262362723390
ClusterType lc
Multinode yes
autoload yes
useDiskLease yes
MaxFeatureLevelAllowed 809
tiebreakerDisks gpfs4nsd

root@zd110l13:/root#mmlsconfig

Configuration data for cluster cluster_name.zd110l13:
-----------------------------------------------------
clusterName cluster_name.zd110l13
clusterId 729741152660153204
clusterType lc
autoload no
useDiskLease yes
maxFeatureLevelAllowed 912
tiebreakerDisks gpfs3nsd;gpfs4nsd
[zd110l13]
takeOverSdrServ yes

File systems in cluster cluster_name.zd110l13:
----------------------------------------------
/dev/gpfsfs0

root@zd110l13:/var/adm/ras#df -k | grep /dev/gpfsfs0
/dev/gpfsfs0 2097152000 2009668608 5% 101193 5% /data/documentum/dmadmin

-- Change the status of a disk, and listing status: mmchdisk and mmlsdisk
-- ----------------------------------------------------------------------

You can even simulate the loss of an NSD disk from a cluster, for example:

# mmchdisk my_gpfs stop -d "gpfs1nsd"

# mmlsdisk my_gpfs -L

disk         driver   sector failure holds    holds
name         type       size   group metadata data  status        availability disk id remarks
-----------------------------------------------------------------------------------------------------
gpfs1nsd     nsd         512       1 yes      yes   ready         down               1 desc
gpfs2nsd     nsd         512       2 yes      yes   ready         up                 2 desc
gpfs3nsd     nsd         512       3 no       no    ready         up                 3 desc

Here we have used the 2 node cluster with the /my_gpfs filesystem described earlier.
Since the quorum is still met, even with one disk "down", the service is still working.

-- Changes GPFS cluster configuration data.
-- ----------------------------------------

The mmchcluster command serves different purposes:

- Change the primary or secondary GPFS cluster data server.
- Synchronize the primary GPFS cluster data server.
- Change the remote shell and remote file copy programs to be used by the nodes in the cluster.

To change the primary GPFS server for the cluster, enter:

# mmchcluster -p k145n03

-- Changes the attributes of a GPFS file system
-- --------------------------------------------

Use the mmchfs command to change the attributes of a GPFS file system.

To change the default replicas for metadata to 2 and the default replicas for data to 2
for new files created in the fs0 file system, enter:

# mmchfs fs0 -m 2 -r 2

To confirm the change, enter:

# mmlsfs fs0 -m -r

The system displays information similar to:

flag value          description
---- -------------- -----------------------------------
 -m  2              Default number of metadata replicas
 -r  2              Default number of data replicas

More examples:

-- Add a node to the cluster
-- -------------------------

The mmaddnode command adds nodes to an existing GPFS cluster.
On each new node a mount point directory and character mode device is created for each GPFS filesystem.

Example:

To add the nodes "k164n06" and "k164n07" as quorum nodes, designating "k164n06" to be available
as manager node, use the following command:

# mmaddnode -N k164n06:quorum-manager,k164n07:quorum

-- Mounting and unmounting GPFS filesystems
-- ----------------------------------------

Use the mmmount and mmumount commands to mount or unmount a GPFS filesystem on one or more nodes
in the cluster.

Examples:

- To mount all GPFS filesystems on all of the nodes in the cluster:

# mmmount all -a

- To mount filesystem "fs2" read-only on the local node, use

# mmmount fs2 -o ro

- To mount fs1 on all NSD server nodes, use

# mmmount fs1 -N nsdnodes

- To unmount fs1 on all nodes of the cluster, use

# mmumount fs1 -a

-- Creates cluster-wide names for Network Shared Disks (NSDs) used by GPFS
-- -----------------------------------------------------------------------

mmcrnsd -F DescFile [-v {yes |no}]

The mmcrnsd command is used to create cluster-wide names for NSDs used by GPFS.
This is the first GPFS step in preparing a disk for use by a GPFS file system.
A disk descriptor file supplied to this command is rewritten with the new NSD names,
and that rewritten disk descriptor file can then be supplied as input to the mmcrfs command.

The name created by the mmcrnsd command is necessary since disks connected at multiple nodes
may have differing disk device names in /dev on each node. The name uniquely identifies the disk.
This command must be run for all disks that are to be used in GPFS file systems.
The mmcrnsd command is also used to assign a primary and backup NSD server that can be used
for I/O operations on behalf of nodes that do not have direct access to the disk.

To identify that the disk has been processed by the mmcrnsd command, a unique NSD volume ID
is written on sector 2 of the disk.
All of the NSD commands (mmcrnsd, mmlsnsd, and mmdelnsd) use this unique NSD volume ID to identify and process NSDs. After the NSDs are created, the GPFS cluster data is updated and they are available for use by GPFS. Examples: To create your NSDs from the descriptor file nsdesc containing: sdav1:k145n05:k145n06:dataOnly:4 sdav2:k145n04::dataAndMetadata:5:ABC enter: # mmcrnsd -F nsdesc 8.4 Installing GPFS: ===================== Installing GPFS V. 2.3 or v. 3.1 Installing GPFS on AIX 5L nodes It is suggested you read Planning for GPFS and the GPFS FAQs at publib.boulder.ibm.com/infocenter/clresctr/topic/com.ibm.cluster.gpfs.doc/gpfs_faqs/gpfsclustersfaq.html. Do not attempt to install GPFS if you do not have the prerequisites listed in Hardware requirements and Software requirements. Ensure that the PATH environment variable on each node includes /usr/lpp/mmfs/bin. The installation process includes: -Files to ease the installation process -Verifying the level of prerequisite software -Installation procedures >> Files to ease the installation process Creation of a file that contains all of the nodes in your GPFS cluster prior to the installation of GPFS, will be useful during the installation process. Using either host names or IP addresses when constructing the file will allow you to use this information when creating your cluster through the mmcrcluster command. For example, create the file /tmp/gpfs.allnodes, listing the nodes one per line: k145n01.dpd.ibm.com k145n02.dpd.ibm.com k145n03.dpd.ibm.com k145n04.dpd.ibm.com k145n05.dpd.ibm.com k145n06.dpd.ibm.com k145n07.dpd.ibm.com k145n08.dpd.ibm.com >> Verifying the level of prerequisite software It is necessary to verify you have the correct levels of the prerequisite software installed. If the correct level of prerequisite software is not installed, see the appropriate installation manual before proceeding with your GPFS installation: 1. 
AIX 5L Version 5 Release 2 with the latest level of service available

   # WCOLL=/tmp/gpfs.allnodes dsh "oslevel"

   Output similar to this should be displayed:

   5.2.0.10

2. AIX 5L Version 5 Release 3 with the latest level of service available

   # WCOLL=/tmp/gpfs.allnodes dsh "oslevel"

   Output similar to this should be displayed:

   5.3.0.0

   If you are utilizing NFS V4, at a minimum your output should include:

   5.3.0.10

>> Installation procedures

The installation procedures are generalized for all levels of GPFS. Ensure you substitute the
correct numeric value for the modification (m) and fix (f) levels, where applicable.
The modification and fix level are dependent upon the level of PTF support.

Follow these steps to install the GPFS software using the installp command:

1. Electronic license agreement
2. Creating the GPFS directory
3. Creating the GPFS installation table of contents file
4. Installing the GPFS man pages
5. Installing GPFS on your network
6. Existing GPFS files
7. Verifying the GPFS installation

--1. Electronic license agreement

The GPFS software license agreement is shipped and viewable electronically.
The electronic license agreement must be accepted before software installation can continue.
For additional software package installations, the installation cannot occur unless the
appropriate license agreements are accepted. When using the installp command, use the -Y flag
to accept licenses and the -E flag to view license agreement files on the media.

--2. Creating the GPFS directory

To create the GPFS directory:

On any node create a temporary subdirectory where GPFS installation images will be extracted.
For example:

# mkdir /tmp/gpfslpp

Copy the installation images from the CD-ROM to the new directory, by issuing:

# bffcreate -qvX -t /tmp/gpfslpp -d /dev/cd0 all

This will place the following GPFS images in the image directory:

gpfs.base
gpfs.docs
gpfs.msg.en_US

--3.
Creating the GPFS installation table of contents file Make the new image directory the current directory: # cd /tmp/gpfslpp Use the inutoc command to create a .toc file. The .toc file is used by the installp command. # inutoc . --4. Installing the GPFS man pages In order to use the GPFS man pages you must install the gpfs.docs image. The GPFS manual pages will be located at /usr/share/man/. Installation consideration: The gpfs.docs image need not be installed on all nodes if man pages are not desired or local file system space on the node is minimal. --5. Installing GPFS on your network Install GPFS according to these directions, where localNode is the name of the node on which you are running: If you are installing on a shared file system network, ensure the directory where the GPFS images can be found is NFS exported to all of the nodes planned for your GPFS cluster (/tmp/gpfs.allnodes). Ensure an acceptable directory or mountpoint is available on each target node, such as /tmp/gpfslpp. If there is not, create one: # WCOLL=/tmp/gpfs.allnodes dsh "mkdir /tmp/gpfslpp" If you are installing on a shared file system network, to place the GPFS images on each node in your network, issue: # WCOLL=/tmp/gpfs.allnodes dsh "mount localNode:/tmp/gpfslpp /tmp/gpfslpp" Otherwise, issue: # WCOLL=/tmp/gpfs.allnodes dsh "rcp localNode:/tmp/gpfslpp/gpfs* /tmp/gpfslpp" # WCOLL=/tmp/gpfs.allnodes dsh "rcp localNode:/tmp/gpfslpp/.toc /tmp/gpfslpp" Install GPFS on each node: # WCOLL=/tmp/gpfs.allnodes dsh "installp -agXYd /tmp/gpfslpp gpfs" --6. Existing GPFS files If you have previously installed GPFS on your system, during the install process you may see messages similar to: Some configuration files could not be automatically merged into the system during the installation. The previous versions of these files have been saved in a configuration directory as listed below. Compare the saved files and the newly installed files to determine if you need to recover configuration data. 
Consult product documentation to determine how to merge the data. Configuration files which were saved in /lpp/save.config: /var/mmfs/etc/gpfsready /var/mmfs/etc/gpfsrecover.src /var/mmfs/etc/mmfsdown.scr /var/mmfs/etc/mmfsup.scr If you have made changes to any of these files, you will have to reconcile the differences with the new versions of the files in directory /var/mmfs/etc. This does not apply to file /var/mmfs/etc/mmfs.cfg which is automatically maintained by GPFS. --7. Verifying the GPFS installation Use the lslpp command to verify the installation of GPFS file sets on each node: lslpp -l gpfs\* Output similar to the following should be returned: Fileset Level State Description ---------------------------------------------------------------------------- Path: /usr/lib/objrepos gpfs.base 2.3.0.0 COMMITTED GPFS File Manager gpfs.docs.data 2.3.0.0 COMMITTED GPFS Server Manpages gpfs.msg.en_US 2.3.0.0 COMMITTED GPFS Server Messages - U.S. English Path: /etc/objrepos gpfs.base 2.3.0.0 COMMITTED GPFS File Manager Example: root@zd110l14:/root#lslpp -L "*gpfs*" Fileset Level State Type Description (Uninstaller) ---------------------------------------------------------------------------- gpfs.base 3.1.0.11 C F GPFS File Manager gpfs.docs.data 3.1.0.4 C F GPFS Server Manpages and Documentation gpfs.msg.en_US 3.1.0.10 C F GPFS Server Messages - U.S. English State codes: A -- Applied. B -- Broken. C -- Committed. E -- EFIX Locked. O -- Obsolete. (partially migrated to newer version) ? -- Inconsistent State...Run lppchk -v. Type codes: F -- Installp Fileset P -- Product C -- Component T -- Feature R -- RPM Package E -- Interim Fix root@zd110l14:/root#lslpp -l gpfs\* Fileset Level State Description ---------------------------------------------------------------------------- Path: /usr/lib/objrepos gpfs.base 3.1.0.11 COMMITTED GPFS File Manager gpfs.msg.en_US 3.1.0.10 COMMITTED GPFS Server Messages - U.S. 
English Path: /etc/objrepos gpfs.base 3.1.0.11 COMMITTED GPFS File Manager Path: /usr/share/lib/objrepos gpfs.docs.data 3.1.0.4 COMMITTED GPFS Server Manpages and Documentation 8.5 GPFS error messages: ========================= The MMFS log GPFS writes both operational messages and error data to the MMFS log file. The MMFS log can be found in the /var/adm/ras directory on each node. The MMFS log file is named mmfs.log.date.nodeName, where date is the time stamp when the instance of GPFS started on the node and nodeName is the name of the node. The latest mmfs log file can be found by using the symbolic file name /var/adm/ras/mmfs.log.latest. The MMFS log from the previous instance of GPFS can be found by using the symbolic file name /var/adm/ras/mmfs.log.previous. All other files have a timestamp and node name appended to the file name. Example: root@zd110l13:/var/adm/ras#cat mmfs.log.latest Sun May 20 22:10:37 DFT 2007 runmmfs starting Removing old /var/adm/ras/mmfs.log.* files: Loading kernel extension from /usr/lpp/mmfs/bin . . . GPFS: 6027-500 /usr/lpp/mmfs/bin/aix64/mmfs64 loaded and configured. Sun May 20 22:10:39 2007: GPFS: 6027-310 mmfsd64 initializing. {Version: 3.1.0.11 Built: Apr 6 2007 09:38:56} ... Sun May 20 22:10:44 2007: GPFS: 6027-1710 Connecting to 10.32.143.184 zd110l14.nl.eu.abnamro.com Sun May 20 22:10:44 2007: GPFS: 6027-1711 Connected to 10.32.143.184 zd110l14.nl.eu.abnamro.com Sun May 20 22:10:44 2007: GPFS: 6027-300 mmfsd ready Sun May 20 22:10:44 DFT 2007: mmcommon mmfsup invoked Sun May 20 22:10:44 DFT 2007: mounting /dev/gpfsfs0 Sun May 20 22:10:44 2007: Command: mount gpfsfs0 323816 Sun May 20 22:10:46 2007: Command: err 0: mount gpfsfs0 323816 Sun May 20 22:10:46 DFT 2007: finished mounting /dev/gpfsfs0 At GPFS startup, files that have not been accessed during the last ten days are deleted. If you want to save old files, copy them elsewhere. 
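When checking a node, it can help to pull only the numbered GPFS messages out of the MMFS log. The Python sketch below is an illustration, not a GPFS tool: the function names are hypothetical, the sample lines are taken from the log excerpt above, and it assumes (based on that excerpt) that message 6027-300 "mmfsd ready" marks a started daemon.

```python
import re

# Matches the "GPFS: 6027-nnn text" part of an MMFS log line.
MSG_RE = re.compile(r"GPFS: (6027-\d+) (.*)")


def gpfs_messages(log_text):
    """Yield (message_number, text) for each log line carrying a 'GPFS: 6027-nnn' message."""
    for line in log_text.splitlines():
        match = MSG_RE.search(line)
        if match:
            yield match.group(1), match.group(2).strip()


def daemon_ready(log_text):
    """Return True if the log contains message 6027-300 (mmfsd ready)."""
    return any(number == "6027-300" for number, _ in gpfs_messages(log_text))


SAMPLE = """Sun May 20 22:10:39 2007: GPFS: 6027-310 mmfsd64 initializing. {Version: 3.1.0.11 Built: Apr 6 2007 09:38:56} ...
Sun May 20 22:10:44 2007: GPFS: 6027-1710 Connecting to 10.32.143.184 zd110l14.nl.eu.abnamro.com
Sun May 20 22:10:44 2007: GPFS: 6027-300 mmfsd ready
Sun May 20 22:10:44 DFT 2007: mounting /dev/gpfsfs0
"""

if __name__ == "__main__":
    # On a real node the text would be read from /var/adm/ras/mmfs.log.latest instead.
    for number, text in gpfs_messages(SAMPLE):
        print(number, text)
    print("daemon ready:", daemon_ready(SAMPLE))
```

Lines without a 6027 message number, such as the mount messages written by the startup scripts, are skipped by this filter.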
This example shows normal operational messages that appear in the MMFS log file: Tue Aug 31 16:02:43 edt 2004 runmmfs starting Removing old /var/adm/ras/mmfs.log.* files: mv: 0653-401 Cannot rename /var/adm/ras/mmfs.log.previous to /var/adm/ras/mmfs.log.previous.save: A file or directory in the path name does not exist. Loading kernel extension from /usr/lpp/mmfs/bin . . . /usr/lpp/mmfs/bin/vcmdummy64 loaded and configured /usr/lpp/mmfs/bin/aix64/mmfs64 loaded and configured. Tue Aug 31 16:02:44 2004: GPFS: 6027-310 mmfsd64 initializing. {Version: 3.7.0.0 Built: Aug 30 2004 17:10:20} ... Tue Aug 31 16:02:54 2004: GPFS: 6027-1710 Connecting to 198.16.0.9 k154gn09 Tue Aug 31 16:02:55 2004: GPFS: 6027-1711 Connected to 198.16.0.9 k154gn09 Tue Aug 31 16:02:55 2004: GPFS: 6027-1709 Accepted and connected to 198.16.0.2 k154gn02 Tue Aug 31 16:02:55 2004: GPFS: 6027-1709 Accepted and connected to 198.16.0.18 k155gn02 Tue Aug 31 16:02:55 2004: GPFS: 6027-1709 Accepted and connected to 198.16.0.49 kolt1g_r1b32 Tue Aug 31 16:02:55 2004: GPFS: 6027-1709 Accepted and connected to 198.16.0.17 k155gn01 Tue Aug 31 16:02:55 2004: GPFS: 6027-1710 Connecting to 198.16.0.10 k154gn10 Tue Aug 31 16:02:55 2004: GPFS: 6027-1709 Accepted and connected to 198.16.0.35 Tue Aug 31 16:02:55 2004: GPFS: 6027-1709 Accepted and connected to 198.16.0.5 Tue Aug 31 16:02:57 2004: GPFS: 6027-1709 Accepted and connected to 198.16.0.23 Tue Aug 31 16:02:57 2004: GPFS: 6027-1709 Accepted and connected to 198.16.0.6 Tue Aug 31 16:02:57 2004: GPFS: 6027-1709 Accepted and connected to 198.16.0.21 Tue Aug 31 16:03:00 edt 2004 /var/mmfs/etc/gpfsready invoked Tue Aug 31 16:03:00 2004: GPFS: 6027-300 mmfsd ready Tue Aug 31 16:03:00 2004: GPFS: 6027-1709 Accepted and connected to 198.16.0.10 k154gn10 Tue Aug 31 16:03:00 edt 2004: mounting /dev/fs3 Tue Aug 31 16:03:00 2004: Command: mount fs3 594128 Depending on the size and complexity of your system configuration, the amount of time to start GPFS varies. 
Taking your system configuration into consideration, after a reasonable amount of time if you cannot access the file system look in the log file for error messages. The GPFS log is a repository of error conditions that have been detected on each node, as well as operational events such as file system mounts. The GPFS log is the first place to look when attempting to debug abnormal events. Since GPFS is a cluster file system, events that occur on one node may affect system behavior on other nodes, and all GPFS logs may have relevant data. GPFS for AIX 5L V2.2 in an HACMP Cluster Problem Determination Guide The operating system error log facility GPFS records file system or disk failures using the error logging facility provided by the operating system: syslog facility on Linux and errpt facility on AIX. For the remainder of this book, the error logging facility will be referred to as 'the error log'. These failures can be viewed by issuing this command: errpt -a The error log contains information about several classes of events or errors. These classes are: MMFS_ABNORMAL_SHUTDOWN MMFS_DISKFAIL MMFS_ENVIRON MMFS_FSSTRUCT MMFS_GENERIC MMFS_LONGDISKIO MMFS_PHOENIX MMFS_QUOTA MMFS_SYSTEM_UNMOUNT MMFS_SYSTEM_WARNING MMFS_ABNORMAL_SHUTDOWN The MMFS_ABNORMAL_SHUTDOWN error log entry means that GPFS has determined that it must shutdown all operations on this node because of a problem. This is most likely caused by some interaction with the Group Services component. Group services failures may result in abnormal shutdown, as well as possible loss of quorum. Insufficient memory on the node to handle critical recovery situations can also cause this error. In general there will be other error log entries from GPFS or some other component associated with this error log entry. MMFS_DISKFAIL The MMFS_DISKFAIL error log entry indicates that GPFS has detected the failure of a disk and forced the disk to the stopped state. 
Unable to access disks describes the actions taken in response to this error.
This is ordinarily not a GPFS error but a failure in the disk subsystem or the path to the disk subsystem.
See the book AIX 5L System Management Guide: Operating System and Devices and search on logical volume.
Follow the problem determination and repair actions specified.

MMFS_ENVIRON

MMFS_ENVIRON error log entry records are associated with other records of the MMFS_GENERIC or
MMFS_SYSTEM_UNMOUNT types. They indicate that the root cause of the error is external to GPFS and
usually in the network that supports GPFS. Check the network and its physical connections.
The data portion of this record supplies the return code provided by the communications code.

MMFS_FSSTRUCT

The MMFS_FSSTRUCT error log entry indicates that GPFS has detected a problem with the on-disk
structure of the file system. The severity of these errors depends on the exact nature of the
inconsistent data structure. If it is limited to a single file, EIO errors will be reported to the
application and operation will continue. If the inconsistency affects vital metadata structures,
operation will cease on this file system. These errors are often associated with an
MMFS_SYSTEM_UNMOUNT error log entry and will probably occur on all nodes. If the error occurs on
all nodes, some critical piece of the file system is inconsistent. This may occur as a result of
a GPFS error or an error in the disk system.

Issuing the mmfsck command may repair the error:

- Issue the mmfsck -n command to collect data.
- Issue the mmfsck -y command off-line to repair the file system.

If the file system is not repaired after issuing the mmfsck command, contact the IBM Support Center.

MMFS_GENERIC

The MMFS_GENERIC error log entry means that GPFS self diagnostics have detected an internal error,
or that additional information is being provided with an MMFS_SYSTEM_UNMOUNT report.
If the record is associated with an MMFS_SYSTEM_UNMOUNT report, the event code fields in the records will be the same. The error code and return code fields may describe the error. See Messages for a listing of codes generated by GPFS. If the error is generated by the self diagnostic routines, service personnel should interpret the return and error code fields since the use of these fields varies by the specific error. Errors caused by the self checking logic will result in the shutdown of GPFS on this node. MMFS_GENERIC errors may result from an inability to reach a critical disk resource. These errors may look different depending on the specific disk resource that has become unavailable, like logs and allocation maps. This type of error will usually be associated with other error indications. Other errors generated by disk subsystems, high availability components, and communications components at the same time as, or immediately preceding, the GPFS error should be pursued first because they may be the cause of these errors. MMFS_GENERIC error indications without an associated error of those types represent a GPFS problem that requires the IBM Support Center. See Information to collect before contacting the IBM Support Center. MMFS_LONGDISKIO The MMFS_LONGDISKIO error log entry indicates that GPFS is experiencing very long response time for disk requests. This is a warning message and may indicate that your disk system is overloaded or that a failing disk is requiring many I/O retries. Follow your operating system's instructions for monitoring the performance of your I/O subsystem on this node. The data portion of this error record specifies the disk involved. There may be related error log entries from the disk subsystems that will pinpoint the actual cause of the problem. See the book AIX 5L Performance Management Guide. MMFS_PHOENIX MMFS_PHOENIX error log entries reflect a failure in GPFS interaction with Group Services. 
Go to the book Reliable Scalable Cluster Technology: Administration Guide. Search for diagnosing group services problems. Follow the problem determination and repair action specified. These errors are usually not GPFS problems, although they will disrupt GPFS operation. MMFS_QUOTA The MMFS_QUOTA error log entry is used when GPFS detects a problem in the handling of quota information. This entry is created when the quota manager has a problem reading or writing the quota file. If the quota manager cannot read all entries in the quota file when mounting a file system with quotas enabled, the quota manager shuts down, but file system manager initialization continues. Client mounts will not succeed and will return an appropriate error message. In order for GPFS quota accounting to work properly, the system administrator should ensure that the user and group information is consistent throughout the nodeset, such as the /etc/passwd and /etc/group files are identical across the nodeset. Otherwise, unpredictable and erroneous quota accounting will occur. It may be necessary to run an off-line quota check (mmcheckquota) to repair or recreate the quota file. If the quota file is corrupted, mmcheckquota will not restore it. The file must be restored from the backup copy. If there is no backup copy, an empty file may be set as the new quota file. This is equivalent to recreating the quota file. To set an empty file or use the backup file, issue the mmcheckquota command with the appropriate operand: -u UserQuotaFilename for the user quota file -g GroupQuotaFilename for the group quota file Reissue the mmcheckquota command to check the file system inode and space usage. MMFS_SYSTEM_UNMOUNT The MMFS_SYSTEM_UNMOUNT error log entry means that GPFS has discovered a condition which may result in data corruption if operation with this file system continues from this node. 
GPFS has marked the file system as disconnected and applications accessing files within the file system will receive ESTALE errors. This may be the result of:
- The loss of a path to all disks containing a critical data structure.
- An internal processing error within the file system.
See File system forced unmount. Follow the problem determination and repair actions specified.

MMFS_SYSTEM_WARNING
The MMFS_SYSTEM_WARNING error log entry means that GPFS has detected a system level value approaching its maximum limit. This may occur as a result of the number of inodes (files) reaching its limit. Issue the mmchfs command to increase the number of inodes for the file system so there is at least a minimum of 5% free.

Error log entry example
This is an example of an error log entry which indicates loss of the Group Services subsystem:

LABEL:            MMFS_ABNORMAL_SHUTD
IDENTIFIER:       1FB9260D

Date/Time:        Thu May 16 14:39:07 EDT
Sequence Number:  759
Machine Id:       000196364C00
Node Id:          k145n01
Class:            S
Type:             PERM
Resource Name:    mmfs

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

Failure Causes
SOFTWARE PROGRAM

Recommended Actions
CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
COMPONENT ID
595B9500
PROGRAM
mmfsd64
DETECTING MODULE
/fs/mmfs/ts/phoenix/PhoenixInt.C
MAINTENANCE LEVEL
2.2.0.0
LINE
4409
RETURN CODE
668
REASON CODE
0000 0000
EVENT CODE
0

===============================
9. HACMP
===============================

Section 9: HACMP

9.1: Overview Cluster solutions and terminology on AIX:
========================================================

-- CSM: (Management of Cluster)
-- ----------------------------

What is Cluster Systems Management (CSM)?
Cluster Systems Management (CSM) software provides a distributed system management solution that allows a system administrator to set up and maintain a cluster of nodes that run the AIX® or Linux® operating system.
CSM simplifies cluster administration tasks by providing management from a single point-of-control. CSM can be used to manage homogeneous clusters of servers that run Linux, homogeneous servers that run AIX, or mixed clusters which include both AIX and Linux.

You can use the following hardware for your CSM management server, install server, and nodes:

IBM System x: System x, IBM xSeries®, IBM BladeCenter®*, and IBM eServer 325, 326, and 326m hardware
IBM System p: System p, IBM pSeries, IBM BladeCenter*, System p5, IBM eServer OpenPower

*The BladeCenter JS models use the POWER architecture common to all System p servers.

The management server is the machine that is designated to operate, monitor, and maintain the rest of the cluster. Install servers are the machines that are used to install the nodes. By default, the management server is the install server. Managed nodes are instances of the operating system that you can manage in the cluster. Managed devices are the non-node devices for which CSM supports power control and remote console access. For hardware and software support information, see Planning for CSM software.

Communicating with CSM:
CSM offers you several options for issuing commands to the cluster:
-Command line interface
-Distributed Command Execution Manager (DCEM)
-IBM® Web-based System Manager
-SMIT

-- GPFS:
-- -----

Introducing General Parallel File System

GPFS is a high-performance cluster file system for AIX 5L, Linux and mixed clusters that provides users with shared access to files spanning multiple disk drives. By dividing individual files into blocks and reading/writing these blocks in parallel across multiple disks, GPFS provides very high bandwidth; in fact, GPFS has won awards and set world records for performance. In addition, GPFS's multiple data paths can also eliminate single points of failure, making GPFS extremely reliable.
GPFS currently powers many of the world’s largest scientific supercomputers and is increasingly used in commercial applications requiring high-speed access to large volumes of data such as digital media, engineering design, business intelligence, financial analysis and geographic information systems. GPFS is based on a shared disk model, providing lower overhead access to disks not directly attached to the application nodes, and using a distributed protocol to provide data coherence for access from any node. IBM's General Parallel File System (GPFS) provides file system services to parallel and serial applications. GPFS allows parallel applications simultaneous access to the same files, or different files, from any node which has the GPFS file system mounted while managing a high level of control over all file system operations. GPFS is particularly appropriate in an environment where the aggregate peak need for data bandwidth exceeds the capability of a distributed file system server. GPFS allows users shared file access within a single GPFS cluster and across multiple GPFS clusters. A GPFS cluster consists of: AIX 5L™ nodes, Linux® nodes, or a combination thereof (see GPFS cluster configurations). A node may be: An individual operating system image on a single computer within a cluster. A system partition containing an operating system. Some System p5™ and pSeries® machines allow multiple system partitions, each of which is considered to be a node within the GPFS cluster. Network shared disks (NSDs) created and maintained by the NSD component of GPFS All disks utilized by GPFS must first be given a globally accessible NSD name. The GPFS NSD component provides a method for cluster-wide disk naming and access. 
On Linux machines running GPFS, you may give an NSD name to: Physical disks Logical partitions of a disk Representations of physical disks (such as LUNs) On AIX® machines running GPFS, you may give an NSD name to: Physical disks Virtual shared disks Representations of physical disks (such as LUNs) A shared network for GPFS communications allowing a single network view of the configuration. A single network, a LAN or a switch, is used for GPFS communication, including the NSD communication. -- PSSP: (predecessor to Cluster Systems Management (CSM)) -- ------------------------------------------------------- Parallel System Support Programs (PSSP) The PSSP 3.5 software is a comprehensive suite of applications to manage a system as a full-function parallel processing system. It provides administrative tasks that help increase productivity by enabling administrators to view, monitor, and operate the system from the control workstation, a single point of control. The PSSP software is discussed in terms of functional entities called components of PSSP. Most functions are base components of PSSP while others are optional; they come with the PSSP software, but you can choose whether to install and use them. With PSSP 3.5, AIX 5L 5.1 or 5.2 must be on the control workstation. Note that your control workstation must be at the highest AIX level in the system. If you have any HMC-controlled servers in your system, AIX 5L 5.1 or 5.2 must be on each HMC-controlled server node. Other nodes can have AIX 5L 5.1 and PSSP 3.4, or AIX 4.3.3 with PSSP 3.4 or PSSP 3.2. However, you can only run with the 64-bit AIX kernel and switch between 64-bit and 32-bit AIX kernel mode on nodes with PSSP 3.5. Parallel System Support Programs (PSSP) for AIX® PSSP is the systems management predecessor to Cluster Systems Management (CSM) and does not support IBM System p servers or AIX 5L V5.3. 
New cluster deployments should use CSM and existing PSSP customers with software maintenance will be transitioned to CSM at no charge. -- Tivoli Workload Scheduler LoadLeveler -- ------------------------------------- Used for dynamic workload scheduling, Tivoli Workload Scheduler LoadLeveler is a distributed network-wide job management facility designed to dynamically schedule work such as maximize resource utilization and minimize job completion time. Jobs are scheduled based on job priority, job requirements, resource availability and user-defined rules to match processing needs with resources. LoadLeveler provides consolidated accounting and reporting and supports IBM servers including IBM System p and System x environments. -- Engineering Scientific Subroutine Library (ESSL) and Parallel ESSL -- ------------------------------------------------------------------ ESSL is a collection of state–of–the–art mathematical subroutines specifically tuned to IBM hardware and offering significant performance improvement to any math–intensive scientific or engineering applications. Parallel ESSL extends the function of ESSL to support parallel applications that use the Message Passing Interface included in IBM Parallel Environment. ESSL and Parallel ESSL support C, C++ and Fortran applications. -- Parallel Environment (PE) -- ------------------------- Parallel Environment for AIX 5L is a comprehensive development and execution environment for parallel applications (distributed-memory, message-passing applications running across multiple nodes). It is designed to help organizations develop, test, debug, tune and run high-performance parallel applications in C, C++ and Fortran on IBM System p and System x clusters. Parallel Environment runs on AIX 5L V5.2 and V5.3. -- HACMP: -- ------ HACMP is designed to provide high availability for critical business applications and data through system redundancy and failover. 
HACMP constantly monitors the status of servers, networks and applications to detect failures or performance degradation and can respond by automatically restarting a troubled application on designated backup hardware, taking care of all network or storage connections in the process. With HACMP, clients can scale up to 32 nodes and mix and match system sizes and performance levels as well as network adapters and disk subsystems to satisfy specific application, network and disk performance needs.

HACMP/XD extends HACMP’s high availability capabilities across geographic sites with remote data mirroring (replication) and failover using this mirrored data; this combination can maintain application and data availability even if an entire site is disabled by a disaster. HACMP/XD provides IP-based data mirroring and also supports hardware-based mirroring products such as IBM Enterprise Storage Systems Metro-Mirror (formerly PPRC).

-- RSCT:
-- -----

Reliable Scalable Cluster Technology.

Since HACMP 5.1, HACMP relies on RSCT. So, in modern HACMP, RSCT is a necessary component or subsystem. For example, HACMP uses the heartbeat facility of RSCT. RSCT is a standard component in AIX 5L.

Reliable Scalable Cluster Technology, or RSCT, is a set of software components that together provide a comprehensive clustering environment for AIX® and Linux®. RSCT is the infrastructure used by a variety of IBM® products to provide clusters with improved system availability, scalability, and ease of use.

RSCT includes the following components:

- Resource Monitoring and Control (RMC) subsystem. This is the scalable, reliable backbone of RSCT. It runs on a single machine or on each node (operating system image) of a cluster and provides a common abstraction for the resources of the individual system or the cluster of nodes. You can use RMC for single system monitoring or for monitoring nodes in a cluster.
In a cluster, however, RMC provides global access to subsystems and resources throughout the cluster, thus providing a single monitoring and management infrastructure for clusters. - RSCT core resource managers. A resource manager is a software layer between a resource (a hardware or software entity that provides services to some other component) and RMC. A resource manager maps programmatic abstractions in RMC into the actual calls and commands of a resource. - RSCT cluster security services, which provide the security infrastructure that enables RSCT components to authenticate the identity of other parties. - Topology Services subsystem, which, on some cluster configurations, provides node and network failure detection. Group Services subsystem, which, on some cluster configurations, provides cross-node/process coordination. RSCT is the “glue” that holds the nodes together in a cluster. It is a group of low-level components that allow clustering technologies, such as High-Availability Cluster Multiprocessing (HACMP) and General Parallel File System (GPFS), to be built easily. RSCT technology was originally developed by IBM for RS/6000 SP systems (Scalable POWERparallel). As time passed, it became apparent that these capabilities could be used on a growing number of general computing applications, so they were moved into components closer to the operating system (OS), such as Resource Monitoring and Control (RMC), Group Services, and Topology Services. The components were originally packaged as part of the RS/6000 SP Parallel System Support Program (PSSP) and called RSCT. RSCT is now packaged as part of AIX 5L Version 5.1 and later. RSCT is also included in Cluster Systems Management (CSM) for Linux. Now, Linux nodes (with appropriate hardware and software levels) running CSM 1.3 for Linux can be part of the management domain cluster 1600, and RSCT (with RMC) is the common interface for clustering. 
For more information about this heterogeneous cluster, see An Introduction to CSM 1.3 for AIX 5L, SG24-6859. RSCT includes these components: -Resource Monitoring and Control (RMC) -Resource managers (RM) -Cluster Security Services (CtSec) -Group Services -Topology Services Group Services and Topology Services Group Services and Topology Services, although included in RSCT, are not used in the management domain structure of CSM. These two components are used in peer domain clusters for applications, such as High-Availability Cluster Multiprocessing (HACMP) and General Parallel File System (GPFS), providing node and process coordination and node and network failure detection. Therefore, for these applications, a .rhosts file may be needed (for example, for HACMP configuration synchronization). These services are often referred to as hats and hags: high availability Group Services daemon (hagsd) and high availability Topology Services daemon (hatsd). - What are management domains and peer domains? In order to understand how the various RSCT components are used in a cluster, you should be aware that nodes of a cluster can be configured for either manageability or high availability. >> You configure a set of nodes for manageability using the Clusters Systems Management (CSM) product as described in IBM® Cluster Systems Management: Administration Guide. The set of nodes configured for manageability is called a management domain of your cluster. >>You configure a set of nodes for high availability using RSCT's Configuration resource manager. The set of nodes configured for high availability is called an RSCT peer domain of your cluster. For more information, refer to Creating and administering an RSCT peer domain. -- HPSS: -- ----- High Performance Storage System What is High Performance Storage System? HPSS is software that manages petabytes of data on disk and robotic tape libraries. 
HPSS provides highly flexible and scalable hierarchical storage management that keeps recently used data on disk and less recently used data on tape. HPSS uses cluster, LAN and/or SAN technology to aggregate the capacity and performance of many computers, disks, and tape drives into a single virtual file system of exceptional size and versatility. This approach enables HPSS to easily meet otherwise unachievable demands of total storage capacity, file sizes, data rates, and number of objects stored. HPSS provides a variety of user and filesystem interfaces ranging from the ubiquitous vfs, ftp, samba and nfs to higher performance pftp, client API, local file mover and third party SAN (SAN3P). HPSS also provides hierarchical storage management (HSM) services for IBM General Parallel File System (GPFS). -- C-SPOC: -- ------- The Cluster Single Point of Control (C-SPOC) utility lets system administrators perform administrative tasks on all cluster nodes from any node in the cluster. -- HA Network Server: -- ------------------ The High Availability Network Server (HA Network Server) is a complete solution that quickly and automatically configures certain network services in a high availability environment. HA Network Server solution is designed to enhance the HACMP product by offering a set of scripts that set up highly available network services such as Domain Name System (DNS), Dynamic Host Configuration Protocol (DHCP), Network File System (NFS), and printing services. This is possible by using the framework offered in HACMP to monitor and act upon potential problems with network services in order to extend high availability beyond just hardware recovery. Making these services highly available means there is no down time in services that are critical to running a business. This solution is now available by download. 
HA Network Server components The HA Network Server solution is comprised of three network service plug-ins providing for DNS, DHCP, and print services (HACMP already contains integrated support for high availability NFS (HANFS)). Each of these plug-ins is available on this Web site as a downloadable tar file. These example scripts start and stop the network service processes, verify that configuration files are present and stored in a shared filesystem, and assist the HACMP monitoring functions that check on the health of the network service process. These scripts are provided as examples that may be customized for your environment. A setup program is also provided with each of these plug-ins to assist with the setup after downloading the plug-in. Since several prerequisites must be completed by the user before setup begins, please read the README file that is included within the plug-in tar file. After download and tar file expansion, the README will be located in /usr/es/sbin/cluster/plug-ins/, where will be dns, dhcp, or printserver depending on which plug-in was downloaded. 9.2: Items in HACMP: ===================== Application Servers: -------------------- To put the application under HACMP control, you create an application server resource that associates a user-defined name with the names of specially written scripts to start and stop the application. By defining an application server, HACMP can start another instance of the application on the takeover node when a fallover occurs. This protects your application so that it does not become a single point of failure. An application server can also be monitored with the application monitoring feature and the Application Availability Analysis tool. After you define the application server, you can add it to a resource group. A resource group is a set of resources that you define so that the HACMP software can treat them as a single unit. 
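As a rough illustration only, the relationship described above (an application server is a user-defined name tied to a start script and a stop script, and a resource group bundles resources so they are handled as one unit) can be modelled in a few lines of Python. This is not HACMP's configuration format or API; the class names, the server name, the group name and the script paths are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ApplicationServer:
    """A user-defined name tied to a start script and a stop script."""
    name: str
    start_script: str
    stop_script: str


@dataclass
class ResourceGroup:
    """A set of resources that is treated as a single unit."""
    name: str
    app_servers: List[ApplicationServer] = field(default_factory=list)

    def add(self, server: ApplicationServer) -> None:
        self.app_servers.append(server)

    def takeover_commands(self) -> List[str]:
        # On fallover, the takeover node would run each start script in the group.
        return [server.start_script for server in self.app_servers]


# Hypothetical names and paths, for illustration only.
db = ApplicationServer("db_server",
                       "/usr/local/ha/start_db.sh",
                       "/usr/local/ha/stop_db.sh")
rg = ResourceGroup("rg_db")
rg.add(db)
print(rg.takeover_commands())
```

The point of the sketch is only that HACMP needs nothing more from the application than a name and two scripts; everything else (where the group runs, when it moves) is decided by the cluster software.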
Application Monitoring: ----------------------- HACMP can monitor applications that are defined to application servers, in one of two ways: -Process monitoring detects the termination of a process, using RSCT Resource Monitoring and Control (RMC) capability. -Custom monitoring monitors the health of an application based on a monitor method that you define. Daemons: -------- Cluster Services Notice that if you list the daemons in the AIX System Resource Controller (SRC), you will see ES appended to their names. The actual executables do not have the ES appended; the process table shows the executable by path (/usr/es/sbin/cluster...). The following lists the required and optional HACMP/ES daemons: - Cluster Manager daemon (clstrmgr): This daemon monitors the status of the nodes and their interfaces, and invokes the appropriate scripts in response to node or network events. It also centralizes the storage of and publishes updated information about HACMP-defined resource groups. The Cluster Manager on each node coordinates information gathered from the HACMP global ODM, and other Cluster Managers in the cluster to maintain updated information about the content, location, and status of all HACMP resource groups. This information is updated and synchronized among all nodes whenever an event occurs that affects resource group configuration, status, or location. All cluster nodes must run the clstrmgr daemon. - Cluster SMUX Peer daemon (clsmuxpd): This daemon maintains status information about cluster objects. This daemon works in conjunction with the Simple Network Management Protocol (snmpd) daemon. All cluster nodes must run the clsmuxpd daemon. Note: The clsmuxpd daemon cannot be started unless the snmpd daemon is running. - Cluster Information Program daemon (clinfo): This daemon provides status information about the cluster to cluster nodes and clients and invokes the /usr/es/sbin/cluster/etc/clinfo.rc script in response to a cluster event. 
The clinfo daemon is optional on cluster nodes and clients.

- Cluster Lock Manager daemon (cllockd): This daemon provides advisory locking services. The cllockd daemon is required on cluster nodes only if those nodes are part of a concurrent access configuration.

- Cluster Topology Services daemon (topsvcsd): This daemon monitors the status of network adapters in the cluster. All cluster nodes must run the topsvcsd daemon.

- Cluster Event Management daemon (emsvcsd): This daemon matches information about the state of system resources with information about resource conditions of interest to client programs (applications, subsystems, and other programs). The emsvcsd daemon runs on each node of a domain.

- Event Management AIX Operating System Resource Monitor (emaixos): This daemon acts as a resource monitor for the event management subsystem and provides information about the operating system characteristics and utilization. The emaixos daemon is started automatically by Event Management.

- Cluster Group Services daemon (grpsvcsd): This daemon manages all of the distributed protocols required for cluster operation. All cluster nodes must run the grpsvcsd daemon.

- Cluster Globalized Server Daemon (grpglsmd): This daemon operates as a grpsvcs client; its function is to make switch adapter membership global across all cluster nodes. All cluster nodes must run the grpglsmd daemon.

- Group Services Concurrent Logical Volume Manager (gsclvmd): When extended concurrent Volume Groups are used, this process manages concurrent Volumes.

The AIX System Resource Controller (SRC) controls the HACMP/ES daemons (except for cllockd, which is a kernel extension). It provides a consistent interface for starting, stopping, and monitoring processes by grouping sets of related programs into subsystems and groups. In addition, it provides facilities for logging of abnormal terminations of subsystems or groups and for tracing of one or more subsystems.
The HACMP/ES daemons are collected into the following SRC subsystems and groups:

Daemon                            Subsystem     Group
/usr/es/sbin/cluster/clstrmgr     clstrmgrES    cluster
/usr/es/sbin/cluster/clinfo       clinfoES      cluster
/usr/es/sbin/cluster/clsmuxpd     clsmuxpdES    cluster
/usr/es/sbin/cluster/cllockd      cllockdES     lock
/usr/sbin/rsct/bin/emsvcs         emsvcs        emsvcs
/usr/sbin/rsct/bin/topsvcs        topsvcs       topsvcs
/usr/sbin/rsct/bin/hagsglsmd      grpglsm       grpsvcs
/usr/sbin/rsct/bin/emaixos        emsvcs        emsvcs
/usr/es/sbin/cluster/clcomd       clcomdES      clcomd

When using the SRC commands, you can control the clstrmgr, clinfo, and clsmuxpd daemons by specifying the SRC cluster group.

The required and optional HACMP and RSCT daemons are:

- clcomdES     Cluster communication daemon
- clstrmgrES   Cluster manager
- clinfoES     Cluster information daemon
- rmcd         RSCT Resource Monitoring and Control daemon
- hatsd        RSCT Topology Services subsystem (includes hats_nim*, which send and receive heartbeats)
- hagsd        RSCT Group Services subsystem
- grpglsmd     main function is to make switch adapter membership global across all cluster nodes.

Starting with HACMP 5.3, the cluster manager process is always running. It can be in one of two states, as displayed by the command

# lssrc -ls clstrmgrES

ST_INIT            (start event has executed)
ST_NOTCONFIGURED   (start event has not executed)

Understanding Cluster Service Startup:
--------------------------------------

You start cluster services on a node by executing the HACMP/ES /usr/es/sbin/cluster/etc/rc.cluster script. Use the Start Cluster Services SMIT screen, described in the section Starting Cluster Services, to build and execute this command. The rc.cluster script initializes the environment required for HACMP/ES by setting environment variables and then calls the /usr/es/sbin/cluster/utilities/clstart script to start the HACMP/ES daemons. The clstart script is the HACMP/ES script that starts all the cluster services. The clstart script calls the SRC startsrc command to start the specified subsystem or group.
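To make the subsystem/group relationship in the SRC table above concrete, here is a small Python sketch that regroups the table rows by SRC group and builds the kind of group-level command line that the SRC accepts (startsrc -g). The rows are copied from the table; the helper names are made up for this example, and the exact arguments that clstart really passes to startsrc are not shown in these notes, so treat the generated string as an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (daemon path, SRC subsystem, SRC group), copied from the table above.
SRC_TABLE: List[Tuple[str, str, str]] = [
    ("/usr/es/sbin/cluster/clstrmgr", "clstrmgrES", "cluster"),
    ("/usr/es/sbin/cluster/clinfo", "clinfoES", "cluster"),
    ("/usr/es/sbin/cluster/clsmuxpd", "clsmuxpdES", "cluster"),
    ("/usr/es/sbin/cluster/cllockd", "cllockdES", "lock"),
    ("/usr/sbin/rsct/bin/emsvcs", "emsvcs", "emsvcs"),
    ("/usr/sbin/rsct/bin/topsvcs", "topsvcs", "topsvcs"),
    ("/usr/sbin/rsct/bin/hagsglsmd", "grpglsm", "grpsvcs"),
    ("/usr/sbin/rsct/bin/emaixos", "emsvcs", "emsvcs"),
    ("/usr/es/sbin/cluster/clcomd", "clcomdES", "clcomd"),
]


def subsystems_by_group(table: List[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Return {group: [subsystem, ...]} in table order, without duplicates."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for _daemon, subsystem, group in table:
        if subsystem not in groups[group]:
            groups[group].append(subsystem)
    return dict(groups)


def startsrc_command(group: str) -> str:
    """Build (but do not run) a group-level start command for the SRC."""
    return f"startsrc -g {group}"


groups = subsystems_by_group(SRC_TABLE)
print(groups["cluster"])
print(startsrc_command("cluster"))
```

Addressing the "cluster" group therefore covers clstrmgrES, clinfoES and clsmuxpdES in one command, which is what the remark about controlling clstrmgr, clinfo and clsmuxpd through the SRC cluster group amounts to.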
The following figure illustrates the major commands and scripts called at cluster startup: rc.cluster -> clstart -> startsrc The HACMP/ES daemons are started in the following order: -RSCT daemons (Group Services, Topology Services, then Event Management) -Cluster Manager -Cluster SMUX daemon -Cluster Information Program daemon (optional) Using the C-SPOC utility, you can start cluster services on any node (or on all nodes) in a cluster by executing the C-SPOC /usr/es/sbin/cluster/sbin/cl_rc.cluster command on a single cluster node. The C-SPOC cl_rc.cluster command calls the rc.cluster command to start cluster services on the nodes specified from the one node. The nodes are started in sequential order, not in parallel. The output of the command run on the remote node is returned to the originating node. Because the command is executed remotely, there can be a delay before the command output is returned. The following example shows the major commands and scripts executed on all cluster nodes when cluster services are started in clusters using the C-SPOC utility. NODE A NODE B cl_rc.cluster | \rsh | \ rc.cluster rc.cluster | | | | clstart clstart | | | | startsrc startsrc -- Automatically Restarting Cluster Services You can optionally have cluster services start whenever the system is rebooted. If you specify the -R flag to the rc.cluster command, or specify "restart or both" in the Start Cluster Services SMIT screen, the rc.cluster script adds the following line to the /etc/inittab file. hacmp:2:wait:/usr/es/sbin/cluster/etc/rc.cluster -boot> /dev/console 2>&1 # Bring up Cluster At system boot, this entry causes AIX to execute the /usr/es/sbin/cluster/etc/rc.cluster script to start HACMP/ES. WARNING: Be aware that if the cluster services are set to restart automatically at boot time, you may face problems with node integration after a power failure and restoration, or you may want to test a node after doing maintenance work before having it rejoin the cluster. 
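An /etc/inittab entry has four colon-separated fields: identifier, run level(s), action, and command. The following Python sketch splits the hacmp line quoted above into those fields and reports whether such an entry is present in a given inittab text. The function names and the sample text are made up for this example, and skipping lines with an empty identifier is an assumption about how commented-out entries look.

```python
from typing import NamedTuple, Optional


class InittabEntry(NamedTuple):
    ident: str
    runlevels: str
    action: str
    command: str


def parse_inittab_line(line: str) -> Optional[InittabEntry]:
    """Split one inittab line into its four colon-separated fields."""
    # Split on the first three colons only: the command may contain ':' itself.
    parts = line.rstrip("\n").split(":", 3)
    if len(parts) != 4 or not parts[0]:
        return None  # blank, malformed, or (assumed) commented-out line
    ident, runlevels, action, command = parts
    return InittabEntry(ident, runlevels, action, command.strip())


def hacmp_autostart_entry(inittab_text: str) -> Optional[InittabEntry]:
    """Return the entry whose identifier is 'hacmp', or None if there is none."""
    for line in inittab_text.splitlines():
        entry = parse_inittab_line(line)
        if entry is not None and entry.ident == "hacmp":
            return entry
    return None


# Sample text for illustration; on a real node you would read /etc/inittab.
SAMPLE = (
    "init:2:initdefault:\n"
    "hacmp:2:wait:/usr/es/sbin/cluster/etc/rc.cluster -boot> /dev/console 2>&1 # Bring up Cluster\n"
)
entry = hacmp_autostart_entry(SAMPLE)
print(entry)
```

If no hacmp entry is found, cluster services are not set to start at boot, which is the safer setting in the situations mentioned in the WARNING above.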
-- Starting Cluster Services with IP Address Takeover Enabled

If IP address takeover is enabled, the /usr/es/sbin/cluster/etc/rc.cluster script calls the /etc/rc.net script to configure and start the TCP/IP interfaces and to set the required network options.

-- Editing the rc.cluster File to Turn Deadman Switch Off

In HACMP/ES, the Deadman Switch (DMS) is controlled by RSCT Topology Services. If, in a rare case, you want to turn the DMS off, you must edit the rc.cluster file as follows:

There is a -D flag in clstart, located in /usr/es/sbin/cluster/utilities.
In the /usr/es/sbin/cluster/etc/rc.cluster file, find a call to "clstart" at about line #486. Edit this call to include the -D flag.

Understanding Stopping Cluster Services:
----------------------------------------

You stop cluster services on a node by executing the HACMP/ES /usr/es/sbin/cluster/utilities/clstop script. Use the HACMP for AIX Stop Cluster Services SMIT screen, described in the section Stopping Cluster Services, to build and execute this command.

The clstop script stops an HACMP/ES daemon or daemons. It stops all the cluster services, or individual cluster services, by calling the SRC command stopsrc.

The following figure illustrates the major commands and scripts called at cluster shutdown:

clstop -> stopsrc

Using the C-SPOC utility, you can stop cluster services on a single node or on all nodes in a cluster by executing the C-SPOC /usr/es/sbin/cluster/sbin/cl_clstop command on a single node. The C-SPOC cl_clstop command performs some cluster-wide verification and then calls the clstop command to stop cluster services on the specified nodes. The nodes are stopped in sequential order, not in parallel. The output of the command run on the remote node is returned to the originating node. Because the command is executed remotely, there can be a delay before the command output is returned.
NODE A              NODE B

cl_clstop
    |   \rsh
    |       \
 clstop             clstop
    |                  |
    |                  |
 stopsrc            stopsrc

Starting and stopping using smitty:

To start cluster services, use
smit cl_admin -> Manage HACMP Services -> Start Cluster Services

To stop cluster services, use
smit cl_admin -> Manage HACMP Services -> Stop Cluster Services

9.3: Most important commands in HACMP:
=======================================

9.4 Other notes on HACMP:
==========================

Filesets and compatibility list HACMP versions - AIX versions:

Note 1:
-------

HACMP Version Compatibility Matrix
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD101347

Document Author:     Shawn Bodily
Document ID:         TD101347
Doc. Organization:   Advanced Technical Support
Document Revised:    03/06/2007
Product(s) covered:  HACMP

Abstract: This document provides a HACMP Version Compatibility Matrix.

HACMP Version  Supported?     AIX Level(s)          MISC
1.2            NO             3.2.5
2.1            NO             3.2.5
3.1.0          NO             3.2.5
3.1.1          NO             3.2.5
4.1.0          NO             4.1.X
4.1.1          NO             4.1.X
4.2            NO             4.1.4, 4.2.X
4.2.1          NO             4.1.5, 4.2.X
4.2.2          NO             4.1.5, 4.2.1, 4.3.X
4.3            NO             4.3.2, 4.3.3
4.3.1          NO             4.3.2, 4.3.3
4.4            NO             4.3.3
4.4.1          NO             4.3.3, 5.1
4.5            NO             5.1, 5.2
5.1            NO-09/01/2006  5.1, 5.2, 5.3
5.2            Y-9/30/2007    5.1, 5.2, 5.3
5.3            Y-9/30/2008    5.2(ML4), 5.3(ML2)    AIX 5.2 RSCT 2.3.6 or higher
                                                    AIX 5.3 RSCT 2.4.2 or higher
5.4            Yes            5.2(TL8), 5.3(TL4)    AIX 5.2 RSCT 2.3.9 or higher
                                                    AIX 5.3 RSCT 2.4.5 or higher

Cross Reference Chart

               AIX 4.3.3   AIX 5.1   AIX 5.1(64-bit)   AIX 5.2   AIX 5.3
HACMP 4.5      No          Yes       No                Yes       No
HACMP/ES 4.5   No          Yes       Yes               Yes       No
HACMP/ES 5.1   No          Yes       Yes               Yes       Yes
HACMP/ES 5.2   No          Yes       Yes               Yes       Yes
HACMP/ES 5.3   No          No        No                Yes       Yes
HACMP/ES 5.4   No          No        No                Yes       Yes

Note 2:
-------

HACMP 5.1 requires:
- AIX 5L v5.1 ML5 with RSCT v2.2.1.30 or higher
- AIX 5L v5.2 ML2 with RSCT v2.3.1.0 or higher
- c-spoc vpath support requires SDD 1.3.1.3 or higher

HACMP 5.2:
Each cluster node must have one of the following installed:
- AIX 5L v5.1 plus the most recent maintenance level (minimum ML 5)
- AIX 5L v5.2 plus the most recent maintenance level (minimum ML 2)

HACMP 5.3 is supported on AIX 5.2 and 5.3:
- AIX 5.2 ML06 or later with RSCT 2.3.6 or later
- AIX 5.3 ML02 or later with RSCT 2.4.2 or later

Note 3: HACMP FAQ:
------------------

I have installed HACMP, now what?
Why does HACMP require so many subnets for IP address takeover?
Does HACMP have any limits?
How can I avoid the nameserver as a single point-of-failure?
What is a config_too_long event?
Do all cluster nodes need to be at the same version of HACMP and AIX 5L operating system?
Why do I need a non-IP heartbeat network?
Can I put different types of processors, communications adapters, or disk subsystems in the same cluster?
What kinds of applications are best suited for a high availability environment?
Can I use Etherchannel with HACMP?
Can I use an existing Enhanced Concurrent Mode volume group for disk heartbeat? Or do I need to define a new one?

Question: I have installed HACMP, now what?

Answer: Before HACMP can manage and keep your application highly available, you need to tell HACMP about your cluster and the application. There are 4 steps:

Step 1) Define the nodes that will keep your application highly available
The local node (the one where you are configuring HACMP) is assumed to be one of the cluster nodes and you must give HACMP the name of the other nodes that make up the cluster.
Just enter a hostname or IP address for each node. Step 2) Define the application you want to keep highly available There are 3 things you need to tell HACMP about the application: name—provide a name start script—specify a script for HACMP to use to start the application stop script—specify a script for HACMP to use to stop the application Step 3) Verify and synchronize the cluster HACMP will discover all the networks and disks connected to the nodes. A verification step will ensure that the cluster configuration will be able to keep the application highly available. When successful the configuration will be copied to the rest of the nodes in the cluster. Step 4) Manage the application When you start HACMP it will begin managing the application and keeping it highly available. You can also use the maintenance facilities provided by HACMP to move the application between nodes for maintenance purposes. To see just how easy it is to configure HACMP, look for Using the SMIT Assistant in Chapter 11 of the Installation Guide. View the online documentation for HACMP. HACMP for Linux does not include the advanced discovery and verification features available on AIX 5L. When configuring HACMP for Linux you must manually define the cluster, networks and network interfaces. Any changes to the configuration require HACMP for Linux to be restarted on all nodes. Question: Why does HACMP require so many subnets for IP address takeover? Answer: HACMP (using RSCT) determines adapter state by sending heartbeats across a specific network interface —as long as heartbeat messages can be sent through an interface, the interface is considered alive. Prior to AIX 5L V5, AIX did not allow more than one interface to own a subnet route but in AIX 5L V5.1 multiple interfaces can have a route to the same subnet. 
This is sometimes referred to as multipath routing or route striping and when this situation exists, AIX 5L will multiplex outgoing packets destined for a particular subnet across all interfaces with a route to that subnet. This interferes with RSCT's ability to reliably send heartbeats to a specific interface. Therefore the subnetting rules for boot, service and persistent labels are such that there will never be a duplicate subnet route created by the placement of these addresses. HACMP V5 includes a new feature whereby you may be able to avoid some of the subnet requirements by configuring HACMP to use a different set of IP alias addresses for heartbeat. With this feature you provide a base or starting address and HACMP calculates a set of addresses in proper subnets—when cluster services are active, HACMP adds these addresses as IP alias addresses to the interfaces and then uses these alias addresses exclusively for heartbeat traffic. You can then assign your "regular" boot, service and persistent labels in any subnet, but be careful: although this feature avoids multipath routing for heartbeat, multipath routing may adversely affect your application. Heartbeat via IP Aliasing is discussed in Chapter 2 of the Concepts and Facilities Guide and Chapter 3 of the Administration and Troubleshooting Guide. View the online documentation for HACMP. Question: Does HACMP have any limits? Answer: The functional limits for HACMP (e.g. number of nodes and networks) can be found in Chapter 1 of the Planning and Installation Guide. View the online documentation for HACMP. Question: How can I avoid the nameserver as a single point-of-failure? 
Answer:
1) Make the nodes look at /etc/hosts first, before the nameserver, by creating an /etc/netsvc.conf
   file with the following entry:

   hosts=local,bind

   where "local" tells it to look at /etc/hosts first and then the nameserver.
2) Remove /etc/resolv.conf (or rename it to save it for later use) so that name resolution
   uses /etc/hosts first.

For information on updating the /etc/hosts file and nameserver configuration, see the Installation Guide.
View the online documentation for HACMP.

Question: What is a config_too_long event?

Answer: The config_too_long event is an informational event run by HACMP whenever a cluster event
runs for longer than a preset time. This can occur when:
- an AIX 5L command (e.g. fsck) is taking a long time to complete, or has hung
- there was an unrecoverable error encountered; in this case there will be an "EVENT FAILED"
  indication in hacmp.out

If the config_too_long event is run, you should check the hacmp.out file to determine the cause
and whether manual intervention is required. For more information on recovery after an event failure,
refer to Recover from HACMP Script Failure in Chapter 18 of the Administration and Troubleshooting Guide.

Question: Do all cluster nodes need to be at the same version of HACMP and AIX 5L operating system?

Answer: No, though there are some restrictions when running mixed mode clusters.
Mixed levels of AIX 5L on cluster nodes do not cause problems for HACMP as long as the level of AIX 5L
is adequate to support the level of HACMP being run on that node. All cluster operations are supported
in such an environment. The HACMP install and update packaging will enforce the minimum level of AIX 5L
required on each system. Similarly for Linux on POWER, different levels of the operating system should
not cause problems as long as the minimum supported version is installed.
Mixing different platforms (AIX 5L, RedHat and SUSE) within the same cluster is not supported.
As a matter of practicality, it is recommended that all nodes be at the same levels of operating system
and HACMP whenever possible. Keeping the operating system, HACMP and the application at the same level
on all nodes will make the administration of the cluster easier and less error prone, and will go
a long way towards reducing the frustration of the administrators.
The Planning Guide has advice for effectively managing different installation and migration scenarios.

Question: Why do I need a non-IP heartbeat network?

Answer: The purpose of the non-IP heartbeat link is often misunderstood. The requirement comes from
the following: HACMP heartbeats on IP networks are sent as UDP datagrams. This means that if a node
or network is congested, the heartbeats can be discarded. If there were only IP networks, and if this
congestion went on long enough, the node would be seen as having failed and HACMP would initiate
a takeover. Since the node is still alive, HACMP takeover can cause both nodes to have the same
IP address, and can cause the nodes to both try to own and access the shared disks.
This situation is sometimes referred to as "split brain" or "partitioned cluster".
Data corruption is all but inevitable in this circumstance.

HACMP therefore strongly recommends that there be at least one non-IP network connecting a node to
at least one other node. For clusters with more than two nodes, the most reliable configuration
includes two non-IP networks on each node. The distance limitations on non-IP links, particularly
RS-232, have often made this requirement difficult to meet. For such clusters, HACMP disk heartbeating
should be strongly considered. Disk heartbeating enables the easy creation of multiple non-IP networks
without requiring additional hardware or software.

Question: Can I put different types of processors, communications adapters, or disk subsystems
in the same cluster?

Answer: In general, yes, as long as the individual components are supported by HACMP.
Note that there are some combinations which may not be reasonable or desirable. For example, putting two Ethernet adapters that run at different speeds on the same network will generally force all adapters on the network to run at the speed of the slower one. Likewise, having a low powered processor back up a high-powered processor may result in unacceptable performance should HACMP have to run the application on the lower powered one. (But see the questions on dynamic LPAR and CUoD for a way of dealing with this). As long as AIX 5L and the hardware support the interconnections, HACMP will support them as well. Question: What kinds of applications are best suited for a high availability environment? Answer: HACMP detects failures in the cluster then moves or restarts resources in order to keep the application highly available. For an application to work well in a high availability environment, the application itself must be capable of being managed (start, stop, restart) programmatically (no user intervention required) and must have no "hard coded" dependencies on specific resources. For example, if the application relies on the hostname of the server (and cannot dynamically accept a change in hostname), then it is practically impossible to restart the application on a backup server after a failure. Question: Can I use Etherchannel with HACMP? Answer: See Using Etherchannel with HACMP. Question: Can I use an existing Enhanced Concurrent Mode volume group for disk heartbeat? Or do I need to define a new one? Answer: To achieve the highest levels of availability under the widest range of failure scenarios, the best practice would be to configure one disk heartbeat connection per physical disk enclosure (or LUN). The heartbeat operation itself involves reading and writing messages from a non-data area of the shared disk. 
Although the space used for heartbeat messages does not decrease the space available for the application (it is in the reserved area of the disk) there is some overhead when the disk seeks back and forth between the reserved area and the application data area. If you configure the disk heartbeat path using the same disk and vg as is used by the application, the best practice is to select a disk which does not have frequently accessed or performance critical application data: although the disk heartbeat overhead is small (2-4 seeks/sec), it could potentially impact application performance or, conversely, excess application access could cause the disk hb connection to appear to go up and down. Ultimately the decision of which disk and volume group to use for heartbeat depends on what makes sense for your shared disk environment and management procedures. For example, using a separate vg just for heartbeat isolates the heartbeat from the application data, but adds another volume group that has to be maintained (during upgrades, changes, etc) and consumes another LUN. If you decide on a separate vg for heartbeat, it does not need to be included in an HACMP resource group, however, the CSPOC utilities use a resource group node list as the set of nodes to perform operations: including the vg in a resource group with just the (sub)set of nodes connected to the disk will let you take advantage of the CSPOC functions. You can also define and use a disk which is not part of any volume group, though such a setup would have to be manually configured and maintained. Note 4: Cluster logfiles: ------------------------- Cluster log files HACMP for AIX scripts, daemons, and utilities write messages to the log files shown below. HACMP log files Log file name Description /var/adm/cluster.log Contains time-stamped, formatted messages generated by HACMP for AIX scripts and daemons. 
In this log file, there is one line written for the start of each event, and one line written
for the completion.

/tmp/hacmp.out
    Contains time-stamped, formatted messages generated by the HACMP for AIX scripts.
    In verbose mode, this log file contains a line-by-line record of each command executed
    in the scripts, including the values of the arguments passed to the commands.
    By default, the HACMP for AIX software writes verbose information to this log file;
    however, you can change this default. Verbose mode is recommended.

system error log
    Contains time-stamped, formatted messages from all AIX subsystems, including the
    HACMP for AIX scripts and daemons.

/usr/sbin/cluster/history/cluster.mmdd
    Contains time-stamped, formatted messages generated by the HACMP for AIX scripts.
    The system creates a new cluster history log file every day that has a cluster event occurring.
    It identifies each day's file by the file name extension, where mm indicates the month
    and dd indicates the day.

/tmp/cm.log
    Contains time-stamped, formatted messages generated by HACMP for AIX clstrmgr activity.
    Information in this file is used by IBM Support personnel when the clstrmgr is in debug mode.
    Note that this file is overwritten every time cluster services are started; so, you should be
    careful to make a copy of it before restarting cluster services on a failed node.

/tmp/cspoc.log
    Contains time-stamped, formatted messages generated by HACMP for AIX C-SPOC commands.
    Because the C-SPOC utility lets you start or stop the cluster from a single cluster node,
    the /tmp/cspoc.log is stored on the node that initiates a C-SPOC command.

/tmp/dms_logs.out
    Stores log messages every time HACMP for AIX triggers the deadman switch.

/tmp/emuhacmp.out
    Contains time-stamped, formatted messages generated by the HACMP for AIX Event Emulator.
    The messages are collected from output files on each node of the cluster, and cataloged
    together into the /tmp/emuhacmp.out log file.
    In verbose mode (recommended), this log file contains a line-by-line record of every event
    emulated. Customized scripts within the event are displayed, but commands within those
    scripts are not executed.

/var/hacmp/clverify/clverify.log
    Contains messages when the cluster verification has run.


#############################################################################################
#############################################################################################
#############################################################################################

===============================================================
Section 10. Cisco IOS version 10.x, 11.x, 12.x router commands:
===============================================================

PART 1: Basic IOS commands:
===========================

1. Entering user mode, or privileged mode, or configuration mode:
-----------------------------------------------------------------

- user mode
-----------

When you access a router through console, aux, or remote terminal, you first enter the router
in "user exec mode" (user mode). Here you can see all settings but you can not change anything.

login to IOS via console, aux, or via a terminal via network -> you enter user exec mode first.

- privileged mode
-----------------

Via the "enable" command you can enter "privileged mode", from which you can enter
configuration mode and change settings of the router.

router>enable
password: xxxx
router#

going back to user mode:
router#disable
router>

logout:
router>logout

- configuration mode
--------------------

When you are in privileged mode, you can enter the "configuration mode":

- change running config
router# configure terminal      (or just config t)
router(config)#

- load the startup config from NVRAM into the running config
router# configure memory        (or just config mem)
router#

so, user mode -> via 'enable' -> privileged mode -> via 'config t' -> configuration mode

Getting out of configuration mode can be done with "exit" or "Ctrl-Z"
- exit brings you 'one level higher'
- Ctrl-Z gets you out of configuration mode

examples:

-- first logon to router
password: xxxx
router>enable
password: yyyy
router#configure terminal
router(config)#enable password abcd
router(config)#enable secret abcd
router(config)#line console 0
router(config-line)#login
router(config-line)#password cisco
router(config-line)#line vty 0 4
router(config-line)#login
router(config-line)#password cisco

router(config)#service password-encryption
router(config)#no service password-encryption

router(config-line)#hostname critter
critter(config)#prompt emma
emma(config)#interface serial 1
emma(config-if)#exit
emma(config)#exit
emma#

router(config)#interface fastethernet0/0
router(config-if)#

router(config)#int f0/0.1
router(config-subif)#

router#config t
router(config)#router rip
router(config-router)#

clock: if the router must provide the clock signal
router(config)#interface serial 0
router(config-if)#clock rate 64000

banners: exec, incoming, login, motd
router(config)#banner motd #
... enter the banner text....
end with #

Prompts:

ROMMON 1>                    Monitor mode
ROUTER>                      user mode
ROUTER#                      privileged mode
router(config)#              global configuration mode
router(config-if)#           interface configuration mode
router(config-subif)#        Sub-interface configuration mode
router(config-line)#         line configuration mode
router(config-router)#       router configuration mode
router(config-ipx-router)#   ipx router configuration mode

Router>enable
Router#config t
Enter configuration commands, one per line. End with CNTL/Z.
Router(config)#exit
Router#exit                  -- ends the session


2. Logging and debugging commands:
----------------------------------

IOS creates (syslog) messages and by default, sends them to the console.
But when you have a telnet session for example, no syslog messages are seen.

router#terminal monitor            means that this terminal is monitoring syslog messages
or
router(config)#logging buffered    means let the router buffer the messages
router#show logging                is the command to display the messages to your terminal session


3. Memory types and configuration types in Cisco routers:
---------------------------------------------------------

When the router boots, it loads its IOS from FLASH memory, which is some sort of
PCMCIA card or EEPROM.
The configuration of the router (address lists, ip addresses on interfaces etc..) is stored
as the "startup configuration" in NVRAM which will be loaded into RAM as the "working configuration".

RAM:   working memory, with loaded IOS from FLASH, and running configuration initially loaded from NVRAM
ROM:   basic IOS software, should not be used normally
FLASH: IOS software (=rewriteable permanent memory)
NVRAM: contains startup, and saved, configuration (=Non Volatile RAM)

You can display the "startup configuration" in NVRAM, and the "running configuration" in RAM
with the following commands:

router#show running-config
router#show startup-config


4. copy of configuration files:
-------------------------------

You can copy the running configuration to the startup configuration, and the other way around.
You can also store the configuration to an ascii file via TFTP.

router#copy running-config startup-config
router#copy startup-config running-config

router#copy tftp startup-config
router#copy startup-config tftp

erase the startup-config:
router#erase startup-config

If you have a new IOS and want to load it into the router:
router#copy tftp flash
And you must reload or reboot the router.


5. BOOT procedure router:
-------------------------

1. power on self test
2. router loads bootstrap code from ROM
3. router finds IOS from flash and loads it
4. router finds startup configuration file and loads it as running configuration

If no configuration is found in NVRAM, the router goes to setup mode.
Here you will be asked to choose between basic and extended setup mode.

The "config-register" command:

You can change the normal sequence by setting the "configuration register" to some other value.
This register is a 16 bit register in the router which can be set by the "config-register" command.
The boot field consists of the 4 low-order bits of the register (the last hex digit).

If the boot field in hex is
- 0: 2100 - load ROMMON; is used for lowlevel debugging or password recovery
- 1: 2101 - RXBOOT; is used to load the limited function IOS from ROM
- 2: 2102 - load normal IOS

example: config-register 0x2101

bit 6 can be used to ignore the NVRAM; for recovering a password put the config-register at 0x2141


6. CDP protocol:
----------------

CDP is enabled by default.

S(config)#no cdp run             -- global command, disabling cdp
S(config)#cdp run                -- enabling cdp
S(config-if)#no cdp enable       -- disabling cdp for this interface
S(config-if)#cdp enable          -- enabling cdp for this interface

S#show cdp neighbors
S#show cdp neighbors detail
S#show cdp entry yosemite
S#show cdp entry yosemite protocol
S#show cdp interface
S#show cdp traffic


7. Configuration interfaces example:
------------------------------------

hostname Gorno
enable password cisco
interface Serial0
 ip address 134.141.12.1 255.255.255.0
interface Serial1
 ip address 134.141.13.1 255.255.255.0
interface Ethernet0
 ip address 134.141.1.1 255.255.255.0

-- to enable rip (classful)
RouterA(config)#router rip
RouterA(config-router)#network 134.141.0.0

-- to disable rip
no router rip

-- to disable rip on 1 interface
RouterA(config)#router rip
RouterA(config-router)#passive-interface serial 0

- Add a route:
ip route network-number network-mask ip-address

ip name-server server-address1 serveraddress-2...
ip domain-lookup
ip route 10.1.2.0 255.255.255.0 10.1.128.252
ip address 10.1.7.252 255.255.255.0 secondary
ip address 10.1.2.252 255.255.255.0

default route example:
----------------------

R1(config)# ip route 0.0.0.0 0.0.0.0 168.13.1.101


PART 2. NETWORK CONFIGURATIONS:
===============================

8. IP/IPX configuration on point-to-point
------------------------------------------

8.1 IP configuration on point-to-point serial links:
----------------------------------------------------

LAPB, HDLC, and PPP are used for a single point-to-point serial link.
See section 10.

   -----
     |
     A
    / \
   Y---S
   |   |
  --- ---

Albuquerque#

A#configure terminal
A(config)# interface serial 0
A(config-if)#ip address 10.1.128.251 255.255.255.0
A(config)# interface serial 1
A(config-if)#ip address 10.1.130.251 255.255.255.0
A(config)# interface ethernet 0
A(config-if)#ip address 10.1.1.251 255.255.255.0

A#show running-config

A#show ip route
     10.0.0.0/24 is subnetted, 3 subnets
C       10.1.1.0 is directly connected, Ethernet0
C       10.1.130.0 is directly connected, Serial1
C       10.1.128.0 is directly connected, Serial0

A#terminal ip netmask-format decimal   -- used to go from /24 notation
                                       -- to 255.255.255.0
A#show ip route

Yosemite#

Y#show ip interface brief
Interface    IP-Address     OK?  Method  Status  Protocol
Serial0      10.1.128.252   YES  Manual  up      up
Serial1      10.1.129.252   YES  Manual  up      up
Ethernet0    10.1.2.252     YES  Manual  up      up

Seville#

S#show ip route
S#show ip interface serial 1
S#show ip interface serial 0
S#show ip arp
S#debug ip packet
IP packet debugging is on
S#ping 10.1.130.251

Add static routes:

A(config)#ip route 10.1.2.0 255.255.255.0 10.1.128.252
A(config)#ip route 10.1.3.0 255.255.255.0 10.1.130.253

A#show ip route
     10.0.0.0/24 is subnetted, 5 subnets
S       10.1.3.0 [1/0] via 10.1.130.253
S       10.1.2.0 [1/0] via 10.1.128.252
C       10.1.1.0 is directly connected, Ethernet0
C       10.1.130.0 is directly connected, Serial1
C       10.1.128.0 is directly connected, Serial0

Set a default route:

R1(config)#ip route 0.0.0.0 0.0.0.0 10.1.17.251

If you use a default route, you should use the command
router(config)#ip classless


8.2 IPX configuration on point-to-point serial links:
-----------------------------------------------------

   -----
     |
     A
    / \
   Y---S
   |   |
  --- ---

=Router Albuquerque:

ipx routing 0200.aaaa.aaaa      (mac address lan)
interface serial0
 ip address 10.1.12.1 255.255.255.0
 ipx network 1012
 bandwidth 56
interface serial1
 ip address 10.1.13.1 255.255.255.0
 ipx network 1013
interface ethernet 0
 ip address 10.1.1.1 255.255.255.0
 ipx network 1

=Router Yosemite:

ipx routing 0200.bbbb.bbbb
interface serial0
 ip address 10.1.12.2 255.255.255.0
 ipx network 1012
 bandwidth 56
interface serial1
 ip address 10.1.23.1 255.255.255.0
 ipx network 1023
interface ethernet 0
 ip address 10.1.2.2 255.255.255.0
 ipx network 2

------------------------

A#show interface serial 0
A#show interface Ethernet0
A#sh int e0
A#show ipx interface serial0
A#show ip interface serial 0
A#show ip interface brief
A#show ip route
A#show ipx route
A#show ipx servers
A#debug ipx routing activity    (IPXRIP activity)
A#debug ipx routing events      (IPXRIP events)
A#debug ipx sap activity        (IPXSAP activity)
A#undebug all
A#no debug all


9. Configuring RIP and IGRP:
----------------------------

Each network command enables RIP or IGRP on a set of interfaces.

RIP:

interface ethernet 0
 ip address 10.1.2.3 255.255.255.0
interface ethernet 1
 ip address 172.16.1.1 255.255.255.0
interface tokenring 0
 ip address 10.1.3.3 255.255.255.0
interface serial 0
 ip address 199.1.1.1 255.255.255.0
interface serial 1
 ip address 199.1.2.1 255.255.255.0

R1#configure terminal
R1(config)#router rip
R1(config-router)#network 199.1.1.0
R1(config-router)#network 10.0.0.0

-- Ethernet0, Tokenring0, Serial0 have rip enabled

IGRP:

R1#configure terminal
R1(config)#router igrp 1        -- autonomous system id
R1(config-router)#network 199.1.1.0
R1(config-router)#network 10.0.0.0
R1(config-router)#network 199.1.2.0
R1(config-router)#network 172.16.0.0

-- all interfaces have now igrp enabled

EIGRP:

router eigrp (autonomous system id)
network command

for example
router eigrp 10
 network 10.0.0.0
 network 172.16.0.0

DEBUGGING:

R1#debug ip rip
R1#debug ip igrp transactions
R1#debug ip igrp events
R1#no debug all
R1#undebug all

DISABLE RIP:

R1(config)#no router rip


10. Serial links:
-----------------

LAPB, HDLC, and PPP are used for a single point-to-point serial link.

        Error detection   Protocol type field
SDLC    Yes               None
LAPB    Yes               None
LAPD    No                None
HDLC    Yes               None / Yes Cisco proprietary
PPP     Yes               Yes

-- encapsulation hdlc | ppp | lapb
   hdlc is default

R1(config)#interface serial 0
R1(config-if)#encapsulation ppp

R1(config)#interface serial 0
R1(config-if)#encapsulation hdlc

-- compress predictor | stac | mppc

R1(config)#interface serial 0
R1(config-if)#ip address 10.1.11.253 255.255.255.0
R1(config-if)#encapsulation ppp
R1(config-if)#compress stac

R1#show compress
R1#show process

- ppp: LCP control protocols like IPCP, LQM, looped link detection, authentication,
  compression, multilink support

- ppp, lapb, hdlc all support compression
  ppp : stac, predictor, mppc
  lapb: stac, predictor
  hdlc: stac

- synchronous serial interface 60 pin D
  V.35, X.21, EIA/TIA-232, EIA/TIA-449, EIA/TIA-530


11. Frame Relay:
----------------

key terms: DTE, DCE, VC, DLCI, LMI, DE, FECN, BECN, LAPF, ITU Q.9xx/ANSI T1.6xx

encapsulation frame-relay (ietf|cisco)
frame-relay lmi-type (cisco|ansi|q933a)
frame-relay map (ip nr - dlci nr)
frame-relay interface-dlci (dlci-nr)
bandwidth num
keepalive sec

show ip route
show frame-relay pvc
show frame-relay map
show frame-relay lmi
show interfaces
show interface s0
debug frame-relay lmi

11.1 One IP subnet/IPX network:
-------------------------------

          -----
            |
            A  dlci 51  199.1.1.1
           / \
  dlci 52 B---C dlci 53  199.1.1.3
199.1.1.2 |   |
         --- ---

example 1: lmi automatic, cisco instead of ietf etc..

Router A:
ipx routing 0200.aaaa.aaaa
interface serial 0
 encapsulation frame-relay
 ip address 199.1.1.1 255.255.255.0
 ipx network 199
interface ethernet 0
 ip address 199.1.10.1 255.255.255.0
 ipx network 1
router igrp 1
 network 199.1.1.0
 network 199.1.10.0

Similar for routers B and C....

example 2: lmi is ansi:

Router A:
ipx routing 0200.aaaa.aaaa
interface serial 0
 encapsulation frame-relay
 frame-relay lmi-type ansi
 ip address 199.1.1.1 255.255.255.0
 ipx network 199
...

Mayberry#show ip route
Mayberry#show frame-relay pvc
Mayberry#show frame-relay map
...
DLCI - IP mapping is here automatically done by Inverse ARP

example 3: same network, no Inverse ARP

Now we must make the mappings ourselves.

Router A:
interface serial 0
 frame-relay map ip 199.1.1.2 52 broadcast
 frame-relay map ipx 199.0200.bbbb.bbbb 52 broadcast
 frame-relay map ip 199.1.1.3 53 broadcast
 frame-relay map ipx 199.0200.cccc.cccc 53 broadcast

similar for routers B and C

11.2 One IP subnet/IPX network per VC:
--------------------------------------

               -----
                 |
                 A  dlci 51
     140.1.1.0=/   \=140.1.2.0
     dlci 52  B     C  dlci 53
              |     |
             ---   ---

Router A:
A(config)#ipx routing 0200.aaaa.aaaa
A(config)#interface serial 0
A(config-if)#encapsulation frame-relay
A(config-if)#interface serial 0.1 point-to-point
A(config-subif)#ip address 140.1.1.1 255.255.255.0
A(config-subif)#ipx network 1
A(config-subif)#frame-relay interface-dlci 52
A(config-fr-dlci)#interface serial 0.2 point-to-point
A(config-subif)#ip address 140.1.2.1 255.255.255.0
A(config-subif)#ipx network 2
A(config-subif)#frame-relay interface-dlci 53
A(config-fr-dlci)#interface ethernet 0
A(config-if)#ip address 140.1.11.1 255.255.255.0
A(config-if)#ipx network 11

Router B:
B(config)#ipx routing 0200.bbbb.bbbb
B(config)#interface serial 0
B(config-if)#encapsulation frame-relay
B(config-if)#interface serial 0.1 point-to-point
B(config-subif)#ip address 140.1.1.2 255.255.255.0
B(config-subif)#ipx network 1
B(config-subif)#frame-relay interface-dlci 51

interface ethernet 0
 ip address 140.1.12.2 255.255.255.0
 ipx network 13

The 'ipx routing' command enables SAP and RIP.
The 'ipx network' command on an interface enables SAP and RIP on that interface.
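
Because the layout of 11.2 repeats the same four lines for every VC, the configuration
text can be generated from a small table. The following is only a minimal Python sketch:
the function name subinterface_config and the tuple layout (ip, mask, ipx network, dlci)
are hypothetical names chosen for this illustration, and the script only prints text,
it does not talk to a router.

```python
def subinterface_config(serial, vcs):
    """Return IOS-style config lines: one point-to-point subinterface per VC.

    serial -- physical interface name, e.g. "serial 0"
    vcs    -- list of (ip, mask, ipx_network, dlci) tuples (hypothetical layout)
    """
    lines = [f"interface {serial}", " encapsulation frame-relay"]
    for number, (ip, mask, ipx_net, dlci) in enumerate(vcs, start=1):
        lines += [
            f"interface {serial}.{number} point-to-point",
            f" ip address {ip} {mask}",
            f" ipx network {ipx_net}",
            f" frame-relay interface-dlci {dlci}",
        ]
    return lines


# The two VCs of Router A from the example above.
router_a = [
    ("140.1.1.1", "255.255.255.0", 1, 52),
    ("140.1.2.1", "255.255.255.0", 2, 53),
]

print("\n".join(subinterface_config("serial 0", router_a)))
```

Adding a third VC then only means adding one more tuple to the list.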

11.3 Different frametypes with IPX:
----------------------------------

Novell:           Cisco:
Ethernet_II       ARPA
Ethernet_802.3    Novell-ether     this is the default
Ethernet_802.2    SAP
Ethernet_SNAP     SNAP

Suppose on the Ethernet of Router B, 2 frametypes are used: Ethernet_802.3 and Ethernet_802.2

Router B:
B(config)#ipx routing 0200.bbbb.bbbb
B(config)#interface serial 0
B(config-if)#encapsulation frame-relay
B(config-if)#interface serial 0.1 point-to-point
B(config-subif)#ip address 140.1.1.2 255.255.255.0
B(config-subif)#ipx network 1
B(config-subif)#frame-relay interface-dlci 51

interface ethernet 0
 ip address 140.1.12.2 255.255.255.0
 ipx network 13 encapsulation novell-ether
 ipx network 23 encapsulation sap secondary

or use

interface ethernet 0.1
 ipx network 13 encapsulation novell-ether
interface ethernet 0.2
 ipx network 23 encapsulation sap


12. Access lists:
=================

ip packet -> inbound ACL -> ROUTING -> outbound ACL ->

- packets can be filtered as they enter an interface, before the routing decision
- packets can be filtered before they exit an interface, after the routing decision

12.1 Standard IP access list:
-----------------------------

Logic:
1. compare the packet with the first access-list statement
2. If a match is made, perform permit or deny
3. Otherwise, repeat the matching with the next sequential access-list statement
4. no match at all, perform deny

The standard access list uses only the source ip address, or part of the address, to filter traffic.
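
The four-step logic above (first match wins, implicit deny at the end) can be sketched in
a few lines of Python. This is only an illustration of the matching rule, not router code:
the function names matches and evaluate, the tuple layout of an entry and the sample
addresses are assumptions made for this sketch. A wildcard bit of 0 means "this bit must
match", a wildcard bit of 1 means "don't care".

```python
import ipaddress


def matches(source_ip, base, wildcard):
    """True if source_ip equals base on every bit where the wildcard bit is 0."""
    ip = int(ipaddress.IPv4Address(source_ip))
    ref = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF  # bits that must match
    return (ip & care) == (ref & care)


def evaluate(acl, source_ip):
    """Walk the statements in order; the first match decides. No match: deny."""
    for action, base, wildcard in acl:
        if matches(source_ip, base, wildcard):
            return action
    return "deny"  # step 4: implicit deny when nothing matched


# Hypothetical list: deny one host, permit everything else.
acl_1 = [
    ("deny", "172.16.3.10", "0.0.0.0"),
    ("permit", "0.0.0.0", "255.255.255.255"),
]

print(evaluate(acl_1, "172.16.3.10"))  # deny
print(evaluate(acl_1, "172.16.4.7"))   # permit
print(evaluate([], "10.1.1.1"))        # deny
```

Because the first match decides, the order of the statements matters: putting the
"permit everything" entry first would make the deny entry unreachable.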

commands:

ip access-group 'number'       : to bind to an interface
access-list 'number'           : define the access-list
access-class                   : to bind to a line (vty)
show access-list               : shows all access lists
show ip access-list            : shows the ip access lists
show ipx access-list           : shows the ipx access lists
show ip interface              : shows all acl's and interfaces
show ipx interface             : shows all acl's and interfaces
show ip interface ethernet 0   : shows all acl's attached to this interface
show ipx interface ethernet 0  : shows all acl's attached to this interface

access-list 'number', where number is 1-99

Wildcards in access-list commands:

0.0.0.0         = complete match ip address
0.0.0.255       = match the first 24 bits
0.0.255.255     = match the first 16 bits
0.255.255.255   = match the first 8 bits
255.255.255.255 = always a match

example 1:
----------

RouterA(config)#interface Ethernet0
RouterA(config-if)#ip address 172.16.1.1 255.255.255.0
RouterA(config-if)#ip access-group 1 out

RouterA(config)#access-list 1 deny 172.16.3.10 0.0.0.0
RouterA(config)#access-list 1 permit 0.0.0.0 255.255.255.255

or the modern equivalent:

interface Ethernet0
 ip address 172.16.1.1 255.255.255.0
 ip access-group 1 out

access-list 1 deny host 172.16.3.10
access-list 1 permit any

example 2:
----------

               -----   10.1.1.0
                 |
             s0  A  s1
   10.1.128.0=/     \=10.1.130.0
         s0 /         \ s0
           B-----------C
           |s1  129  s1|
 10.1.2.0 ---         --- 10.1.3.0
 x= 10.1.2.1

Suppose:
- x not allowed access to 10.1.1.0
- all hosts on 10.1.3.0 not allowed access to 10.1.2.0
- all other combinations are allowed

On Router B:

interface serial 0
 ip access-group 1 out

access-list 1 deny host 10.1.2.1
access-list 1 permit any

On Router C:

interface serial 1
 ip access-group 1 out

access-list 1 deny 10.1.3.0 0.0.0.255
access-list 1 permit any

12.2 Extended IP access list:
-----------------------------

- access-list "number" where number must be in 100-199
- here you can match on ports, protocols, and other fields in the ip and tcp/udp headers

General syntax:

ip access-group 'number'   : to bind to an interface
access-list 'number'       : define the access-list

access-list number deny|permit protocol source destination

RouterA(config)#access-list 101 deny tcp any host 10.1.1.1 eq 23
RouterA(config)#access-list 101 deny tcp any host 10.1.1.1 eq telnet
RouterA(config)#access-list 101 deny udp 1.0.0.0 0.255.255.255 lt 1023 any
RouterA(config)#access-list 101 deny udp 1.0.0.0 0.255.255.255 lt 1023 44.1.2.3 0.0.255.255
RouterA(config)#access-list 101 deny ip 33.1.2.0 0.0.0.255 44.1.2.3 0.0.255.255
RouterA(config)#access-list 101 deny icmp 33.1.2.0 0.0.0.255 44.1.2.3 0.0.255.255 echo
RouterA(config)#access-list 101 deny tcp any host 172.16.30.2 eq 23 log
RouterA(config)#access-list 128 deny tcp any 10.55.66.0 0.0.0.255 eq 23

You should follow this with
RouterA(config)#access-list 101 permit ip any any

12.3 Named IP access list:
--------------------------

numbered: access-list 1-99 permit|deny
named:    ip access-list standard 'name' permit|deny

numbered: access-list 100-199 permit|deny
named:    ip access-list extended 'name' permit|deny

numbered: ip access-group 1-99 in|out
named:    ip access-group 'name' in|out

numbered: ip access-group 100-199 in|out
named:    ip access-group 'name' in|out

Using an access-list with vty telnet access:

config t
line vty 0 4
 login
 password cisco
 access-class 3 in

access-list 3 permit 10.1.1.0 0.0.0.255

12.4 IPX standard and extended access lists:
--------------------------------------------

Similar to IP access lists, IPX has two types of access lists: Standard IPX Access Lists
and Extended IPX Access lists.

Standard:
---------

Standard IPX access lists allow or deny packets based on source and destination IPX addresses.
The template to enter standard IPX access lists is as follows:

access-list (number from 800 to 899) (permit or deny) (source network IPX number)
(destination network IPX number)

The following examples show how the access list permits or denies access to IPX packets.
Router#config t
Router(config)#access-list 810 permit 30 10
Router(config)#int e0
Router(config-if)#ipx access-group 810 out

Router#config t
Router(config)#access-list 810 deny 50 10
Router(config)#int e0
Router(config-if)#ipx access-group 810 out

Extended:
---------

Extended IPX access lists can filter based on the following:
source network, source node, destination network, destination node, IPX protocol (SAP, SPX etc) and IPX sockets.
The template to enter an extended IPX access list is as follows:

access-list (number, 900 to 999) permit or deny (protocol) (source IPX network number) (source socket)
(destination IPX network number) (destination socket)

The following example shows how an extended access list denies IPX network access:

Router(config)#access-list 910 deny -1 50 0 10 0

This denies any IPX protocol type (-1) from IPX network 50, on all sockets (0),
to IPX network 10, on all sockets (0).

Access lists:
-------------

ipx access-group 'number'|'name' in|out        : to bind to an interface
ipx input-sap-filter number                    : to bind a sap filter to an interface
ipx output-sap-filter number                   : to bind a sap filter to an interface
access-list 800-899 permit|deny                : numbered IPX standard
access-list 900-999 permit|deny                : numbered IPX extended
access-list 1000-1099 permit|deny              : numbered SAP access-list
ipx access-list standard|extended|sap 'name'   : named access-list

Example 1:
----------

             102
            -----
          eth1|    eth0
              R2---|101
             /1001 |
            /s0
  |--R1            |
            \s1    |
             \1002
 200          R3---|
          eth0|302

At R1:

ipx routing 0200.1111.1111
interface serial 0
 ip address 10.1.1.1 255.255.255.0
 ipx network 1001
 ipx access-group 820 in
interface serial 1
 ip address 10.1.2.1 255.255.255.0
 ipx network 1002
interface ethernet 0
 ip address 10.1.200.1 255.255.255.0
 ipx network 200
 ipx access-group 810
access-list 820 deny 101
access-list 820 permit -1
access-list 810 permit 302

Example 2: network wildcard mask
--------------------------------

interface serial0
 ip address 10.1.1.2 255.255.255.0
 ipx network 200
 ipx access-group 910
access-list 910 deny any 1000 0000000F
access-list 910 permit any -1

13. Cisco switch configuration:
===============================

Cisco switch IOS is a bit different compared to the regular router IOS, of course due to the different functions.
But for most configuration syntax, they are pretty much alike.
Sometimes, a port is called 'interface', but it's really a port.

A crossover utp cable must be used to connect a switch to another switch or hub:
pin 1 - pin 3
pin 2 - pin 6

example Catalyst 1912 with
12 10BaseT ports: e0/1 - e0/12
2 fastethernet ports: fa0/26, fa0/27

s#show running-config
s#show spantree
s#show vlan-membership
s#show vlan
s#show vlan 3
s#show ip
s#show interfaces
s#show mac-address-table
s#show mac-address-table security
s#show version

s#ip address                (for inband management, global for switch)
s#ip default-gateway
s#mac-address-table permanent mac-address port
s#mac-address-table restricted static port src-list
s#port secure max-mac-count number
s#copy nvram tftp://
s#copy tftp:// nvram
s#address-violation (suspend|ignore|disable)
s#no address-violation
s#delete nvram              note that with a router, it is R1#erase startup-config

nvram is automatically updated when running-config is changed, so there is no 'copy run start' command

sample session: to configure a port
-----------------------------------

s#config terminal
s(config)#ip address 10.5.5.11 255.255.255.0
s(config)#ip default-gateway 10.5.5.3
s(config)#interface e0/1
s(config-if)#duplex half     (full, auto, half, full-flow-control)
s(config-if)#end
s#

sample session: to configure restrictions
-----------------------------------------

In this example, a server is always on port e0/3 (permanent), and another server is on port e0/4
and only devices on port e0/1 are allowed to send frames to it.
s(config)#mac-address-table permanent 0200.2222.2222 e0/3
s(config)#mac-address-table restricted static 0200.1111.1111 e0/4 e0/1
s(config)#end
s#show mac-address-table

sample session: port security
-----------------------------

Port security limits the number of mac addresses associated with a port in the mac address table.
Port security can be used to restrict port e0/1 so that only 3 mac addresses can source frames
that enter port e0/1.

s(config)#mac-address-table permanent 0200.2222.2222 e0/3
s(config)#mac-address-table restricted static 0200.1111.1111 e0/4 e0/1
s(config)#interface ethernet 0/1
s(config-if)#port secure max-mac-count 3
s(config-if)#end
s#show mac-address-table security

VLAN:
-----

A switch creates 1 broadcast domain, but every port is its own collision domain.
This is an implicit VLAN 1.

VLANs:
- can create n broadcast domains = n VLANs = n layer 3 groupings
- routing is needed between VLANs
- a switch lets devices in 1 VLAN communicate, but does not forward a frame entering 1 vlan to a different vlan
- separate address table for each VLAN
- interswitch communication between members of the same vlan is done by tagging the frame
  with a 26 byte ISL header = trunking
- trunking with ISL = Cisco, the alternative is IEEE 802.1Q

Trunking is used between 2 switches, but also between a switch and a router, if the router supports 'ISL' routing.
Then tagged frames can go to and from the router. The router is connected with a trunk link to the switch.
How does the router use this? It sees the vlan-id and layer 3 address in the frame.
The router should be configured as in this example (the encapsulation is set before the ip address,
because IOS only accepts an ip address on a LAN subinterface that already belongs to a VLAN):

#interface fastethernet 0.1
#encapsulation isl 1
#ip address 10.1.1.1 255.255.255.0
#interface fastethernet 0.2
#encapsulation isl 2
#ip address 10.1.2.1 255.255.255.0

But you can also have multiple router interfaces connected to multiple normal access links on the switches
which are in the corresponding VLANs.
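The VLAN rules listed above (a separate address table per VLAN, and no forwarding of a frame from one VLAN into another) can be sketched as a toy model in Python. This is only an illustration, not switch firmware: the class `VlanSwitch`, its method `receive`, the port names and the mac addresses are all hypothetical, and trunking/tagging is left out.

```python
from collections import defaultdict

class VlanSwitch:
    """Toy model: one mac address table per VLAN, frames never cross VLANs."""

    def __init__(self, port_vlan):
        self.port_vlan = dict(port_vlan)   # access port -> VLAN id
        self.tables = defaultdict(dict)    # VLAN id -> {mac: port}

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source mac, then return the sorted list of ports the frame leaves on."""
        vlan = self.port_vlan[in_port]
        table = self.tables[vlan]
        table[src_mac] = in_port
        if dst_mac in table:
            out = table[dst_mac]
            return [] if out == in_port else [out]
        # Unknown destination: flood, but only to ports in the same VLAN
        return sorted(p for p, v in self.port_vlan.items()
                      if v == vlan and p != in_port)

# Hypothetical port assignment: three ports in VLAN 2, two ports in VLAN 3
sw = VlanSwitch({"e0/5": 2, "e0/6": 2, "e0/7": 2, "e0/8": 3, "e0/9": 3})

print(sw.receive("e0/5", "0200.1111.1111", "0200.2222.2222"))  # flooded inside VLAN 2 only
print(sw.receive("e0/6", "0200.2222.2222", "0200.1111.1111"))  # learned: goes to e0/5 only
```

A host in VLAN 3 never receives these frames, which is why a router (or a trunk link to a router) is needed as soon as the two VLANs have to talk to each other.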
sample session: creating VLANS
------------------------------

s(config)#vlan 2 name VLAN2
s(config)#vlan 3 name VLAN3
s(config)#interface e 0/5
s(config-if)#vlan-membership static 2
s(config-if)#interface e 0/6
s(config-if)#vlan-membership static 2
s(config-if)#interface e 0/7
s(config-if)#vlan-membership static 2
..
..
s#show vlan 2
..

To let a VLAN span multiple switches, connect them via fast ethernet ports, and put 'trunking' on.

s1(config)#vlan 2 name VLAN2
s1(config)#vlan 3 name VLAN3
s1(config)#interface e 0/5
s1(config-if)#vlan-membership static 2
s1(config-if)#interface e 0/6
s1(config-if)#vlan-membership static 2
s1(config-if)#interface e 0/7
s1(config-if)#vlan-membership static 2
s1(config-if)#interface e 0/8
s1(config-if)#vlan-membership static 3
s1(config-if)#interface e 0/9
s1(config-if)#vlan-membership static 3
s1(config-if)#interface fa 0/26
s1(config-if)#trunk on
s1(config-if)#vlan-membership static 1
s1(config-if)#vlan-membership static 2
s1(config-if)#vlan-membership static 3

s1#show trunk a|b

VTP:
----

VLAN trunking protocol:
- 1 Domain, 1 VTP Server with VTP clients.
- configure VTP Server and clients:

s1(config)#vtp server domain abc pruning enable
s2(config)#vtp client
s1#show vtp

14. Some Examples:
==================

Example 1:
----------

Starboss# show running-config

Current configuration:
!
version 12.0
service timestamps debug uptime
service timestamps log uptime
no service password-encryption
!
hostname Starboss
!
enable password cwc
!
!
!
!
!
memory-size iomem 15
ip subnet-zero
!
frame-relay switching
isdn switch-type basic-net3
!
!
process-max-time 200
!
interface FastEthernet0/0
 description Starboss RUK LAN
 ip address 172.17.35.70 255.255.255.0
 no ip directed-broadcast
 ip accounting output-packets
 speed 100
 full-duplex
!
interface Serial0/0
 bandwidth 128
 no ip address
 no ip directed-broadcast
 encapsulation frame-relay IETF
 no ip mroute-cache
 priority-group 1
 cdp enable
!
interface Serial0/0.1 point-to-point
 description 32k PVC to Titan ref:NXPC203765
 bandwidth 32
 ip address 10.10.35.2 255.255.255.0
 no ip directed-broadcast
 no arp frame-relay
 frame-relay interface-dlci 100
!
interface BRI0/0
 no ip address
 no ip directed-broadcast
 encapsulation ppp
 shutdown
 dialer map ip 172.17.34.1 02082614099
 dialer-group 1
 isdn switch-type basic-net3
!
interface Ethernet1/0
 description Starboss RPL LAN
 ip address 172.29.31.30 255.255.255.0
 no ip directed-broadcast
!
ip classless
ip route 0.0.0.0 0.0.0.0 Serial0/0.1
no ip http server
!
priority-list 1 protocol ip high tcp telnet
dialer-list 1 protocol ip permit
snmp-server engineID local 000000090200003094017780
snmp-server community ricoh RO
!
line con 0
 password cwc
 transport input none
line aux 0
line vty 0 4
 password cwc
 login
!
end

Starboss#

Example 2:
----------

Titan#show running-config

Current configuration:
!
version 12.0
service timestamps debug uptime
service timestamps log uptime
no service password-encryption
!
hostname Titan
!
enable password cwc
!
ip subnet-zero
!
frame-relay switching
!
!
!
interface FastEthernet0/0
 description connected to Titan LAN
 ip address 172.17.30.33 255.255.255.0
 no ip directed-broadcast
 ip accounting output-packets
!
interface Serial0/0
 description *** LMI to C&W Node HRW/EM1 Fruni 4320 Cct M1181933 NXUK271094 ***
 bandwidth 256
 no ip address
 no ip directed-broadcast
 encapsulation frame-relay IETF
 no ip mroute-cache
 priority-group 1
!
interface Serial0/0.1 point-to-point
 description **** 32k Pvc to Starboss S0/0.1 ****
 bandwidth 32
 ip address 10.10.35.1 255.255.255.0
 no ip directed-broadcast
 frame-relay interface-dlci 101
!
interface Serial0/0.2 point-to-point
 description **** 32k Pvc to Hatton S0/0.1 ****
 bandwidth 32
 ip address 10.10.33.1 255.255.255.0
 no ip directed-broadcast
 frame-relay interface-dlci 102
!
interface Serial0/0.3 point-to-point
 description **** 32k Pvc to Cornhill S0/0.1 ****
 bandwidth 32
 ip address 10.10.37.1 255.255.255.0
 no ip directed-broadcast
 frame-relay interface-dlci 103
!
ip classless
ip route 10.1.7.1 255.255.255.255 172.17.30.22 permanent
ip route 133.139.117.53 255.255.255.255 172.17.30.1 permanent
ip route 133.139.157.51 255.255.255.255 172.17.30.1
ip route 172.17.0.0 255.255.0.0 172.17.30.1
ip route 172.17.2.209 255.255.255.255 172.17.30.1 permanent
ip route 172.17.31.0 255.255.255.0 172.17.30.1
ip route 172.17.32.0 255.255.255.0 172.17.30.1
ip route 172.17.33.0 255.255.255.0 Serial0/0.2
ip route 172.17.35.0 255.255.255.0 Serial0/0.1
ip route 172.17.36.0 255.255.255.0 172.17.30.1
ip route 172.17.37.0 255.255.255.0 Serial0/0.3
ip route 172.17.38.0 255.255.255.0 Null0
ip route 172.29.31.0 255.255.255.0 172.17.35.70 permanent
ip route 192.168.174.6 255.255.255.255 172.17.30.1 permanent
no ip http server
!
priority-list 1 protocol ip high tcp telnet
snmp-server engineID local 000000090200003094C14FA0
snmp-server community ricoh RO
!
line con 0
 password cwc
 transport input none
line aux 0
line vty 0 4
 password cwc
 login
!
end

Titan#

PART 3: OTHER STUFF:
====================

1. Subnetting ip network:
-------------------------

Traditional Classes:

A: 1-126     0xxxxxxx.yyyyyyyy.yyyyyyyy.yyyyyyyy
B: 128-191   10xxxxxx.xxxxxxxx.yyyyyyyy.yyyyyyyy
C: 192-223   110xxxxx.xxxxxxxx.xxxxxxxx.yyyyyyyy
D: 224-239   1110----.--------.--------.--------

Class C subnetting:

mask               subnets  hosts  subnetbits  hostbits
-----------------------------------------------------------
*255.255.255.128   NA       NA     1           7          (not valid)
255.255.255.192    2        62     2           6
255.255.255.224    6        30     3           5
255.255.255.240    14       14     4           4
255.255.255.248    30       6      5           3
255.255.255.252    62       2      6           2

Class B subnetting:

mask               subnets  hosts  subnetbits  hostbits
-----------------------------------------------------------
255.255.128.0      NA       NA     1           15
255.255.192.0      2        16382  2           14
255.255.224.0      6        8190   3           13
255.255.240.0      14       4094   4           12
255.255.248.0      30       2046   5           11
255.255.252.0      62       1022   6           10
255.255.254.0      126      510    7           9
255.255.255.0      254      254    8           8
255.255.255.128    510      126    9           7
255.255.255.192    1022     62     10          6
255.255.255.224    2046     30     11          5
255.255.255.240    4094     14     12          4
255.255.255.248    8190     6      13          3
255.255.255.252    16382    2      14          2

In these tables, subnets = 2^subnetbits - 2 and hosts = 2^hostbits - 2
(the all-zeros and all-ones values are not counted).

PART 4: ISDN:
=============

- Reference points

   -------NT1---- Carrier/ISDN switch
      T        U

   -------NT2-----NT1---- Carrier/ISDN switch
      S    |    T      U
           |
   -------TA
      R

R1---U---------Provider                    Router with ISDN card with U interface (NT1) - bri0
R1--S/T--NT1---U---Provider                Router with ISDN card with S/T interface (TE1) - bri0
R1--R----TA--S--NT2--T--NT1--U--Provider   Router, no isdn hardware (TE2) - serial0

- Channels:

BRI: 2B+1D, PRI: 23B+1D (US), 30B+1D (Europe)

- Standards

Telephone network and ISDN  - E series, example E.163, E.164
ISDN concepts, interfaces   - I series, example I.100, I.400
Switching and signaling     - Q series, example Q.921, Q.931

- Signalling, Call setup

LAPD is used on the D channel between the router and the ISDN switch.
HDLC or PPP is used on the B channel from end to end, but PPP supports control protocols as well as PAP and CHAP.
Call setup messages refer to both the called and calling SPIDs.

- router setup for PPP and CHAP

Router Fred:

username Barney password xyz
interface bri 0
 ip address 10.3.3.1 255.255.255.0
 encapsulation ppp
 ppp authentication chap

Router Barney:

username Fred password xyz
interface bri 0
 ip address 10.3.3.2 255.255.255.0
 encapsulation ppp
 ppp authentication chap
 ppp multilink

-- ppp multilink

dialer load-threshold 25 either      (in|out|either)
ppp multilink

-- Configuration Router

RouterA#config t
RouterA(config)#int bri0
RouterA(config-if)#encapsulation ppp
RouterA(config-if)#isdn switch-type 'type'       --remote switch type
RouterA(config-if)#isdn spid1 086506610100 8650661
RouterA(config-if)#isdn spid2 086506620100 8650662

-- DDR

1. define static routes on the routers involved

RouterA(config)#int bri0
RouterA(config-if)#ip address 172.16.60.1 255.255.255.0
RouterA(config-if)#encap ppp
RouterA(config)#ip route 172.16.50.0 255.255.255.0 172.16.60.2
RouterA(config)#ip route 172.16.60.2 255.255.255.255 bri0

2. define interesting traffic, or what brings up the isdn line

RouterA(config)#dialer-list 1 protocol ip permit
RouterA(config)#int bri0
RouterA(config-if)#dialer-group 1       -- binds the dialer-list to bri0

3. define the dialer information, or who must be dialed

RouterA(config-if)#dialer-group 1
RouterA(config-if)#dialer string 8350661

or use

RouterA(config-if)#dialer map ip 172.16.60.2 name 804B 8350661

This associates an isdn phone number with a next hop router ip address.

4. And now define an idle time-out to terminate the connection,
and allocate multiple channels at a certain threshold.

RouterA(config-if)#dialer load-threshold 125 either
RouterA(config-if)#dialer idle-timeout 180
RouterA(config-if)#dialer fast-idle 120       (if more B channels active)

5. Access lists

You can limit possible traffic by using an extended access list.
For example, permit only email cross the isdn link RouterA(config)#dialer-list 1 list 110 RouterA(config)#access-list 110 permit tcp any any eq smtp RouterA(config)#int bri0 RouterA(config-if)#dialer-group 1 #show interfaces bri 0:1 #show dialer interface bri 0 #show isdn active #show isdn status #debug isdn q921 #debug isdn q931 #debug dialer events #debug dialer packets ============ PART 5: NAT: ============ CISCO NAT: ========== The translation done by NAT can be either static or dynamic. Static translation is where we specify a lookup table, and one inside address is turned into one pre-specified outside address. Dynamic is where we tell the NAT router what inside addresses need to be translated, and what pool of addresses may be used for the outside addresses. There can be multiple pools of outside addresses. ICMP host unreachable messages are used when addresses run out. With NAT, multiple internal hosts can also share a single outside IP address, which conserves address space. This is done by port multiplexing: changing the source port on the outbound packet so that replies can be directed back to the appropriate machine. Address translation is not practical for large numbers of internal hosts all talking at the same time to the outside world. NAT just won't work well at a large scale. Performance may be a consideration. Currently, NAT causes process switching on NAT interfaces on a Cisco 7000. You can think of this as: the CPU has to look at every packet, to decide whether or not to translate it, and to alter the IP header, possibly the TCP header. One doubts that this will be easily cache-able. Configuring NAT: ---------------- Static: ------- Here's a minimal sample configuration for static address translation. We assume Ethernet 0 is "inside" and Serial 0 is "outside". Private network 10.0.0.0 is used inside, and 192.1.1.0 is used outside. We'll translate "10.1.2.3" to "192.1.1.2" (and vice versa). 
The words "inside source" emphasize that the inside source address is what's getting changed. 10.0.0.0 192.1.1.0 |----------------| --------| |----------------------------- | e0 |----------------|s0 | ------------------------ | | 10.1.2.3 ip nat inside source static 10.1.2.3 192.1.1.2 interface ethernet 0 ip address 10.1.2.1 255.255.255.0 ip nat inside interface serial 0 ip address 192.1.1.1 255.255.255.0 ip nat outside You may add address mappings or inside or outside interfaces as necessary. Dynamic: -------- Let's look at dynamic (pooled) translation. Same network and addresses as before. We'll set up a pool of addresses, translating sources in the range 10.1.2.0 through 10.1.2.255 to the range 192.1.1.10 through 20. The access list indicates what source addresses can be translated. The idea of the third line is that inside source addresses matching list 20 get translated to addresses from the pool named LegalPool. It pretty much says that, doesn't it! ip nat pool LegalPool 192.1.1.10 192.1.1.20 access-list 20 permit 10.1.2.0 0.0.0.255 ip nat inside source list 20 pool LegalPool interface ethernet 0 ip address 10.1.2.1 255.255.255.0 ip nat inside interface serial 0 ip address 192.1.1.1 255.255.255.0 ip nat outside You can configure outside source address translation similarly, changing "inside source" to "outside source" in the above examples. Let's look at how to do static outside address translation, supposing subnet 10.1.5.0 occurs both inside and outside (we're connecting to another company here). We only need to talk to the outside machine 10.1.5.3, and we'll readdress it as private address 192.168.1.1 on the inside (if we use 10.1.5.x, we have more complex routing issues to think about). This might call for something like the following. 
ip nat outside source static 10.1.5.3 192.168.1.1
interface ethernet 0
 ip address 10.1.2.1 255.255.255.0
 ip nat inside
interface serial 0
 ip address 10.1.3.1 255.255.255.0
 ip nat outside

Examples:
---------

Example 1:
==========

Define Inside Local and Inside Global Addresses:
------------------------------------------------

 A= 10.10.10.1                          171.16.68.1
      |                                      |
      |        |----------------|            |
 --------------|                |-----------------------------  >>>
             e0|----------------|s0

In the configuration shown, when the NAT router receives a packet on its inside interface with
a source address of 10.10.10.1, the source address is translated to 171.16.68.5.
This also means that when the NAT router receives a packet on its outside interface with
a destination address of 171.16.68.5, the destination address is translated to 10.10.10.1.

ip nat inside source static 10.10.10.1 171.16.68.5
!--- Inside device A is known by the outside cloud as 171.16.68.5.

interface s 0
 ip nat inside
interface s 1
 ip nat outside

Because of the way NAT is configured, the inside addresses are the only addresses that are translated;
therefore, the "inside local" address is different from the "inside global" address,
while the "outside local" address is the same as the "outside global" address.

Define Outside Local and Outside Global Addresses:
--------------------------------------------------

In the next configuration, when the NAT router receives a packet on its outside interface with
a source address of 171.16.68.1, the source address is translated to 10.10.10.5.
This also means that if the NAT router receives a packet on its inside interface with
a destination address of 10.10.10.5, the destination address is translated to 171.16.68.1.

ip nat outside source static 171.16.68.1 10.10.10.5
!--- Outside device A is known to the inside cloud as 10.10.10.5.
interface s 0
 ip nat inside
interface s 1
 ip nat outside

In this example, because of the way NAT is configured, only the outside addresses get translated;
therefore, the "outside local" address is different from the "outside global" address,
while the "inside local" address is the same as the "inside global" address.

Define All Local and Global Addresses:
--------------------------------------

In the final configuration, when the NAT router receives a packet on its inside interface with
a source address of 10.10.10.1, the source address is translated to 171.16.68.5.
When the NAT router receives a packet on its outside interface with a source address of 171.16.68.1,
the source address is translated to 10.10.10.5.
This also means that when the NAT router receives a packet on its outside interface with
a destination address of 171.16.68.5, the destination address is translated to 10.10.10.1.
Also, when the NAT router receives a packet on its inside interface with a destination address of 10.10.10.5,
the destination address is translated to 171.16.68.1.

ip nat inside source static 10.10.10.1 171.16.68.5
!--- Inside device A is known to the outside cloud as 171.16.68.5.

ip nat outside source static 171.16.68.1 10.10.10.5
!--- Outside device A is known to the inside cloud as 10.10.10.5.
interface s 0 ip nat inside interface s 1 ip nat outside Example 2: ========== internal Device A | NAT 10.10.10.1/24 --| e0 ------ |----------| | --| ------ | |s0 172.16.130.2/24 | | | | |172.16.130.1/24 ------- | | OutSide Device A ------- |192.168.1.1/24 | | | | | ------------------------- These commands are configured on the NAT router shown above: ip nat pool test 172.16.131.2 172.16.131.10 netmask 255.255.255.0 ip nat inside source list 7 pool test ip nat inside source static 10.10.10.1 172.16.131.1 interface e 0 ip address 10.10.10.254 255.255.255.0 ip nat inside interface s 0 ip address 172.16.130.2 255.255.255.0 ip nat outside ip route 192.168.1.0 255.255.255.0 172.16.130.1 access-list 7 permit 10.10.10.0 0.0.0.255 The configuration on the OutsideA device is: interface Serial1/0 ip address 172.16.130.1 255.255.255.0 serial restart-delay 0 clockrate 64000 ! interface FastEthernet2/0 ip address 192.168.1.1 255.255.255.0 speed auto half-duplex ip route 172.16.131.0 255.255.255.0 172.16.130.2 The configuration on the InsideA device is: interface Ethernet1/0 ip address 10.10.10.1 255.255.255.0 half-duplex ! ip route 0.0.0.0 0.0.0.0 10.10.10.254 Using the show ip nat translations command, you can see the contents of the translation table: NATrouter#show ip nat translations Pro Inside global Inside local Outside local Outside global --- 172.16.131.1 10.10.10.1 --- --- Example 3: ========== internal Device A | NAT 145.21.32.150/22 --| 145.21.32.89/22 e0 ------ |-------------------------| | --| ------ | |e1 10.x.y.z/24 | | | | |10.x.y.w/24 ------- | | OutSide Device A ------- |10. 
| | | | | -------------------------

These commands are configured on the NAT router shown above:

ip nat pool test 10.x.w.n 10.x.w.m netmask 255.255.255.0
ip nat inside source list 7 pool test
ip nat inside source static 145.21.32.150 10.x.w.n

interface e 0
 ip address 145.21.32.89 255.255.252.0
 ip nat inside
interface e 1
 ip address 10.x.y.z 255.255.255.0
 ip nat outside

ip route 192.168.1.0 255.255.255.0 172.16.130.1
access-list 7 permit 145.21.32.0 0.0.0.255

CISCO PIX NAT:
==============

Example 1:
----------

This example shows how to configure a new PIX firewall, out of the box. You will configure passwords,
IP addresses, network address translation (NAT) and basic firewall rules.

Let's say that your boss hands you a new PIX firewall. It has never been configured. He says that it needs
to be configured with some basic IP addresses, security and a couple of basic firewall rules.
You have never used a PIX firewall before. How do you perform this configuration?

-- The basics of a Cisco PIX firewall

A Cisco PIX firewall is meant to protect one network from another. There are PIX firewalls for small home
networks and PIX firewalls for huge campus or corporate networks. In this example, we will be configuring
a PIX 501 firewall. The 501 model is meant for a small home network or a small business.

PIX firewalls have the concept of inside and outside interfaces. The inside interface is the internal,
usually private, network. The outside interface is the external, usually public, network.
You are trying to protect the inside network from the outside network.

PIX firewalls also use the adaptive security algorithm (ASA). This algorithm assigns security levels
to interfaces and says that no traffic can flow from a lower-level interface (like the outside interface)
to a higher-level interface (like the inside interface) without a rule allowing it.
The outside interface has a security level of zero and the inside interface has a security level of 100. Here is what the output of the show nameif command looks like: pixfirewall# show nameif nameif ethernet0 outside security0 nameif ethernet1 inside security100 pixfirewall# Notice the ethernet0 interface is the outside interface (its default name) and the security level is 0. On the other hand, the ethernet1 interface is named inside (the default) and has a security level of 100. -- Guidelines: -- ----------- Before beginning the configuration, your boss has given you some guidelines that you need to follow. Here they are: -All passwords should be set to "cisco" (in reality, you make these whatever you want, but not "cisco"). -The inside network is 10.0.0.0 with a 255.0.0.0 subnet mask. The inside IP address for this PIX should be 10.1.1.1. -The outside network is 1.1.1.0 with a 255.255.255.0 subnet mask. The outside IP address for this PIX should be 1.1.1.1. -You want to create a rule to allow all inside clients on the 10.0.0.0 network to do port address translation and connect to the outside network. They will all share the global IP address 1.1.1.2. -However, clients should only have access to port 80 (Web browsing). -The default route for the outside (Internet) network will be 1.1.1.254. 10.0.0.0 / 8 1.1.1.0 / 24 | | | |----------------| | |---------------| --------------| PIX |--------------------------| Router |-------- e1 |----------------|e0 1.1.1.254 |---------------| 10.1.1.1 1.1.1.1 1.1.1.2 -- The configuration: -- ------------------ When you boot up your PIX firewall for the first time, you should see a screen like this: Cannot be shown in a text document, but looks a bit like: ************************************* Copyright (c) 1996-2003 Cisco Systems, Inc. 
Restricted Rights Legend Use, duplication >>>>>>>>>>>>>>> >>>>>>>> more stuff >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cryptochecksum(changed): d41424 gs6266 e373738 ec52525 Pre-configure PIX Firewall now through interactive prompts [yes]? You will be prompted to answer YES or NO as to whether or not you want to configure the PIX through interactive prompts. Answer NO to this question because you want to learn how to really configure the PIX firewall, not just answer a series of questions. After that, you will be sent to a prompt that looks like this: pixfirewall> With the "greater than" symbol at the end of the prompt, you are in the PIX user mode. Change to privileged mode with the en or enable command. Press "enter" at the Password prompt. Here is an example: pixfirewall> en Password: pixfirewall# You now have administrative mode to show things but would have to go into global configuration mode to configure the PIX. Now, let's move on to basic configuration of the PIX: -- Basic PIX configuration : -- ------------------------ What I am calling basic configuration is made up of three things: Set the hostname Set passwords (login and enable) Configure IP addresses on interfaces Enable interfaces Configure a default route Before you can do any of these things, you need to go into global configuration mode. To do this, type: pixfirewall# config t pixfirewall(config)# To set the hostname, use the hostname command, like this: pixfirewall(config)# hostname PIX1 PIX1(config)# Notice that the prompt changed to the name that you set. Next, set the login password to cisco, like this: PIX1(config)# password cisco PIX1(config)# This is the password required to gain any access to the PIX except administrative access. Now, configure the enable mode password, used to gain administrative mode access. PIX1(config)# enable password cisco PIX1(config)# Now we need to configure IP addresses on interfaces and enable those interfaces. 
The PIX, unlike a router, has no concept of interface configuration mode. To configure the IP address on the inside interface, use this command: PIX1(config)# ip address inside 10.1.1.1 255.0.0.0 PIX1(config)# Now, configure the outside interface IP address: PIX1(config)# ip address outside 1.1.1.1 255.255.255.0 PIX1(config)# Next, enable both the inside and outside interfaces. Make sure that the Ethernet cable, on each interface, is connected to a switch. Note that the ethernet0 interface is the outside interface, and it is only a 10base-T interface on a PIX 501. The ethernet1 interface is the inside interface, and it is a 100Base-T interface. Here is how you enable these interfaces: PIX1(config)# interface ethernet0 10baset PIX1(config)# interface ethernet1 100full PIX1(config)# Note that you can do a show interfaces command, right from the global configuration prompt line. Finally, let's configure a default route so that all traffic sent to the PIX will flow to the next upstream router (the 1.1.1.254 IP address that we were given). Here is how you do this: PIX1(config)# route outside 0 0 1.1.1.254 PIX1(config)# The PIX firewall can, of course, support dynamic routing protocols as well (such as RIP and OSPF). Now, let's move on to some more advanced configuration. -- Network Address Translation: -- ---------------------------- Now that we have IP address connectivity, we need to use Network Address Translation (NAT) to allow inside users to connect to the outside. We will use a type of NAT, called PAT or NAT Overload, so that all inside devices can share one public IP address (the outside IP address of the PIX firewall). To do this, enter these commands: PIX1(config)# nat (inside) 1 10.0.0.0 255.0.0.0 PIX1(config)# global (outside) 1 1.1.1.2 Global 1.1.1.2 will be Port Address Translated PIX1(config)# With this, all inside clients are able to connect to devices on the public network and share IP address 1.1.1.2. 
However, clients don't yet have any rule allowing them to do this.

-- Firewall rules:
-- ---------------

These clients on the inside network have a NAT translation, but that doesn't necessarily mean that they
are allowed access. They now need a rule to allow them to access the outside network (the Internet).
That rule will also allow the return traffic to come back in.

To make a rule to allow these clients port 80 (Web browsing), you would type this:

PIX1(config)# access-list outbound permit tcp 10.0.0.0 255.0.0.0 any eq 80
PIX1(config)# access-group outbound in interface inside
PIX1(config)#

Note that PIX access lists, unlike router access lists, use a normal subnet mask, not a wildcard mask.

With this access list, you have restricted the inside hosts to accessing only Web servers (TCP port 80)
on the outside network.

-- Showing and saving the configuration:
-- -------------------------------------

Now that you have configured the PIX firewall, you can show your configuration with the show run command.

Make sure that you save your configuration with the write memory or wr m command. If you don't,
your configuration will be lost when the PIX is powered off.


Example 2:
----------

!--- Sets the outside address of the PIX Firewall:
ip address outside 131.1.23.2

!--- Sets the inside address of the PIX Firewall:
ip address inside 10.10.254.1

!--- Sets the global pool for hosts inside the firewall:
global (outside) 1 131.1.23.12-131.1.23.254

!--- Allows hosts in the 10.0.0.0 network to be
!--- translated through the PIX:
nat (inside) 1 10.0.0.0

!--- Configures a static translation for an admin workstation
!--- with local address 10.14.8.50:
static (inside,outside) 131.1.23.11 10.14.8.50

!--- Allows syslog packets to pass through the PIX from RTRA.
!--- You can use conduits OR access-lists to permit traffic.
!--- Conduits have been added to show the use of the command,
!--- however they are commented out in this document, since the
!--- recommendation is to use access-list.
!--- To the admin workstation (syslog server):
!--- Using conduit:
!--- conduit permit udp host 131.1.23.11 eq 514 host 131.1.23.1
!--- Using access-list:
access-list 101 permit udp host 131.1.23.1 host 131.1.23.11 eq 514
access-group 101 in interface outside

!--- Permits incoming mail connections to 131.1.23.10:
static (inside,outside) 131.1.23.10 10.10.254.3
!--- Using conduits
!--- conduit permit TCP host 131.1.23.10 eq smtp any
!--- Using Access-lists, we use access-list 101
!--- which is already applied to interface outside.
access-list 101 permit tcp any host 131.1.23.10 eq smtp

!--- PIX needs static routes or the use of routing protocols
!--- to know about networks not directly connected.
!--- Add a route to network 10.14.8.x/24.
route inside 10.14.8.0 255.255.255.0 10.10.254.2

!--- Add a default route to the rest of the traffic
!--- that goes to the internet.
route outside 0.0.0.0 0.0.0.0 131.1.23.1

!--- Enables the Mail Guard feature
!--- to accept only seven SMTP commands
!--- HELO, MAIL, RCPT, DATA, RSET, NOOP, and QUIT:
!--- (This can be turned off to permit ESMTP by negating with
!--- the no fixup protocol smtp 25 command):
fixup protocol smtp 25

!--- Allows Telnet from the inside workstation at 10.14.8.50
!--- into the inside interface of the PIX:
telnet 10.14.8.50

!--- Turns on logging:
logging on

!--- Turns on the logging facility 20:
logging facility 20

!--- Turns on logging level 7:
logging history 7

!--- Turns on the logging on the inside interface:
logging host inside 10.14.8.50


Example 3:
----------

pix outside: 195.73.20.75 / 255.255.255.248
Device A in inside is: 192.168.1.2 / 255.255.255.0

 A= 192.168.1.2                  195.73.20.75
      |                               |
      |    |----------------|         |                |---------------|
 ----------|      PIX       |--------------------------| ADSL or Cable |--------
         e0|----------------|e1           195.73.20.73 |---------------|

ip address outside 195.73.20.75 255.255.255.248
ip address inside 192.168.1.1 255.255.255.0
nat (inside) 1 192.168.1.0
static (inside,outside) 195.73.20.75 192.168.1.2
route outside 0 0 195.73.20.73 1

(In the static command the global (outside) address comes first, followed by the local (inside) address.)


Example 4:
----------

ip address inside 10.1.1.1 255.255.255.0
ip address outside 209.165.201.1 255.255.255.224
nat (inside) 1 10.1.1.0 255.255.255.0
global (outside) 1 209.165.201.2 netmask 255.255.255.224
static (inside,outside) 209.165.201.3 10.1.1.3 netmask 255.255.255.255
access-list acl_out permit tcp any host 209.165.201.3 eq www
aaa authentication include http outside 209.165.201.3 255.255.255.255 0 0 TACACS+
route outside 0 0 209.165.201.4 1
telnet 10.1.1.2 255.255.255.255

In these examples, the ip address commands specify addresses for the inside and outside network interfaces.
The ip address command only uses network masks. The inside interface has a Class A address, but only the
last octet is used for hosts in the example network, so it has a Class C mask. The outside interface is
part of a subnet, so the mask reflects the .224 subnet value.

The nat command lets users start connections from the inside network. Because a network address is specified,
the class mask specified by the ip address inside command is used.

The global command provides a PAT (Port Address Translation) address to handle the translated connections
from the inside. The global address is also part of the subnet and contains the same mask specified
in the ip address outside command.

The static command maps an inside host to a global address for access by outside users.
Host masks are always specified as 255.255.255.255.

The access-list command permits any outside host to access the global address specified by the static command.
The host parameter is the same as if you specified 209.165.201.3 255.255.255.255.

The aaa command indicates that any users wishing to access the global address must be authenticated.
Because authentication only occurs when users access the specified global address, which is mapped to a host,
the mask is for a host. The "0 0" entry indicates any host and its respective mask.
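The mask reasoning for Example 4 can be checked with a few lines of Python. This is only a sketch
using the standard ipaddress module; the helper name network_of is made up for this illustration and
is not a PIX command. It derives the network implied by each "ip address" line and confirms that the
global, static and default-router addresses all fall inside the outside subnet.

```python
import ipaddress

def network_of(address, mask):
    """Network implied by an 'ip address <if_name> <address> <mask>' line (hypothetical helper)."""
    return ipaddress.ip_interface(f"{address}/{mask}").network

# Addresses and masks taken from Example 4.
inside = network_of("10.1.1.1", "255.255.255.0")
outside = network_of("209.165.201.1", "255.255.255.224")
print("inside network :", inside)
print("outside network:", outside)

# The PAT address (.2), the static global address (.3) and the default
# router (.4) must all be part of the outside subnet.
for addr in ("209.165.201.2", "209.165.201.3", "209.165.201.4"):
    print(addr, "in outside subnet:", ipaddress.ip_address(addr) in outside)

# A host mask (255.255.255.255) covers exactly one address.
host = network_of("209.165.201.3", "255.255.255.255")
print("addresses covered by host mask:", host.num_addresses)
```

The same check is a quick way to catch a global or static address that was accidentally chosen outside
the subnet of the outside interface.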
The route statement specifies the address of the default router. The "0 0" entry indicates any host
and its respective mask.

The telnet command specifies a host that can access the PIX Firewall unit's console using Telnet.
Because it is a single host, a host mask is used.


2. About the Global command:
----------------------------

[no] global [(if_name)] nat_id {global_ip [-global_ip] [netmask global_mask]} | interface
clear global
show global

The global command defines a pool of global addresses. The global addresses in the pool provide an IP address
for each outbound connection, and for those inbound connections resulting from outbound connections.
Ensure that associated nat and global command statements have the same nat_id.

When used on a PPPoE interface, the global command should explicitly include a netmask. Otherwise, the
255.255.255.255 netmask, assigned to the interface by PPPoE, is used as the broadcast mask. In that case,
all addresses in the global pool may become broadcast addresses and will become unusable for address translation.

Use caution with names that contain a "-" (dash) character because the global command interprets the last
(or only) "-" character in the name as a range specifier instead of as part of the name. For example,
the global command treats the name "host-net2" as a range from "host" to "net2". If the name is
"host-net2-section3" then it is interpreted as a range from "host-net2" to "section3".

The following command form is used for Port Address Translation (PAT) only:

global [(if_name)] nat_id {{global_ip} [netmask global_mask] | interface}

After changing or removing a global command statement, use the clear xlate command.

Use the no global command to remove access to a nat_id, or to a Port Address Translation (PAT) address,
or address range within a nat_id.

The "show global" command displays the global command statements in the configuration.
Examples:

global (outside) 1 209.165.201.2 netmask 255.255.255.224
global (outside) 1 209.165.201.1-209.165.201.10 netmask 255.255.255.224
global (outside) 1 interface
global (inside) 1 209.165.202.128 netmask 255.255.255.224

PAT

You can enable the Port Address Translation (PAT) feature by entering a single IP address with the global command.
PAT lets multiple outbound sessions appear to originate from a single IP address. With PAT enabled,
the PIX Firewall chooses a unique port number from the PAT IP address for each outbound xlate (translation slot).
This feature is valuable when an Internet service provider cannot allocate enough unique IP addresses
for your outbound connections. An IP address you specify for a PAT cannot be used in another global address pool.

When a PAT augments a pool of global addresses, first the addresses from the global pool are used,
then the next connection is taken from the PAT address. If a global pool address is available,
the next connection takes that address. The global pool addresses always come first, before a PAT address is used.
Augment a pool of global addresses with a PAT by using the same nat_id in the global command statements
that create the global pools and the PAT.

For example:

global (outside) 1 209.165.201.1-209.165.201.10 netmask 255.255.255.224
global (outside) 1 209.165.201.22 netmask 255.255.255.224


More examples:
--------------

1. == Cisco PIX: Allow traffic to an internal host

Permit selected traffic to an internal host:

First, a static mapping must be made for the host. There is another recipe for this configuration.

static (inside,outside) 1.1.1.1 192.168.0.100 netmask 255.255.255.255

then:

To allow traffic, a conduit must be constructed.
For example, to allow ICMP (ping) traffic to all hosts from anywhere (bad idea):

conduit permit icmp any any

To allow SSH to a specific host from anywhere:

conduit permit tcp host 1.1.1.1 eq 22 any

or

With ACLs:

access-list 100 permit tcp any host 1.1.1.1 eq 22
access-group 100 in interface outside

2. == How to add a static map through a PIX to a device on the inside of your network.

A one to one translation.

static (inside,outside) (outside IP) (inside IP) netmask 255.255.255.255

Example:

static (inside,outside) x.x.x.x x.x.x.x netmask 255.255.255.255

Now you have a static nat to a specific device on the inside of your PIX.
You can now write an Access List to specify what services to allow to this device.

3. == Load a new Cisco PIX software image from a TFTP server:

TFTP (trivial file transfer protocol) provides a convenient means of quickly transferring a PIX software
image to a firewall over an ethernet interface. This procedure is substantially faster than transferring
over a serial port.

Step 1: Copy the image binary file to the TFTP directory.

On most UNIX systems, the default data directory for the TFTP server is /tftpboot.
Copy the image file to this directory and make sure it is world readable
(i.e., chmod 544 /tftpboot/filename.bin).

The first time you try this procedure, or anytime you experience troubles, test the TFTP server
configuration with the tftp client:

cd /tmp
tftp localhost
get filename.bin

You can change directory to /tmp or any other directory that does not contain the image file.
You must use the exact name of your binary file. If there are no error messages, proceed;
otherwise troubleshoot based on the error message.

Step 2: Configure an ethernet interface on the firewall if not already configured.

Test the configuration by pinging the ip address of the TFTP server from the firewall.
Step 3: Load the image

From enable mode on the firewall, the following command will load the image in filename.bin
from the TFTP server at IP address 192.168.200.15:

copy tftp://192.168.200.15/filename.bin flash

You will be asked to confirm this procedure. Press ENTER to confirm.

Step 4: Restart the firewall

From enable mode, use the 'reload' command to restart the firewall.


#############################################################################################
#############################################################################################
#############################################################################################

============================================================
Section 11. Basic VMS commands and Operations:
============================================================


1. The Platform:
================

VMS stands for Virtual Memory System. OpenVMS is not much different from VMS.
It was just a marketing name change to reflect the Posix support in VMS.
In Alpha AXP the AXP does not stand for anything. It's just that you can't
copyright a Greek letter, so DEC added AXP.

VMS and/or OpenVMS commonly runs on DEC VAX hardware and Alpha machines.
VAX stands for Virtual Address eXtension.

The following hardware can run VMS:

- VAX workstations
- small VAXes (MicroVAX I, II, 3000, 4000)
- medium VAXes (VAX 6000, VAX 7000, VAX 8000)
- big VAXes (VAX 9000, VAX 10000)
- ft VAXes
- DEC ALPHAs (DEC 2000,3000,4000,7000,10000)
- AlphaStation ALPHAs
- AlphaServer ALPHAs
- AlphaStation XP/DS/ES
- AlphaServer DS/ES/GS

HP says that OpenVMS will be ported to the Intel Itanium platform, which could have important
consequences for the lifetime/lifecycle of OpenVMS.


2. VMS Files and directories:
=============================

A VMS file specification consists of three parts:

1 physical or logical device name, like PDS$DISK:
2 directory or sub-directory, like [USER], or [USER.TEX]
3 the file name itself, which has the form: name.type;version
  where name is an alphanumeric string of up to 39 characters,
  type is the file type (up to 39 characters),
  and version is the version number between 1 and 32767, e.g.

  PROG.EXE;17, or TEST.TXT;1

So, a fully qualified file name could be written like

PDS$DISK:[USER.TEX]PROG.EXE;17

Thus, the complete file specification of a file stored on a disk has the following format:

device:[directory.subdirectory]filename.type;version

The device and directory part are known as the pathname, and may be prefixed by 'nodename::'
if they are on a different computer.

Each user has defaults for the device name, and the directory.
You may find out your current defaults by typing

$ SHOW DEFAULT

This shows you the device and directory which VMS will assume if you specify a file name only.
You may change the default, separately for the device name and the directory, using

$ SET DEF new_default

If you enter a file name in any command without items 1 and 2, e.g. simply TEST.TXT ,
the system will internally prefix the name with the default device and directory,
and take the highest version number of the file.

Some default file types:

Type   Default for          Contents
COM    DCL                  DCL command file (like Unix script)
EXE    VMS                  Executable program image
C      C Compiler           C source
CXX    CXX Compiler         C++ source
FOR    Fortran Compiler     FORTRAN source
MAR    Macro Assembler      VAX Macro Assembler
DAT    Many things          Data file, e.g. program input/output
LIS    Compilers etc.       Informational listing (e.g. compiler output)
LOG    VMS batch jobs       Batch job output log file
MAP    LINKer               Map created by object linker
OBJ    LINKer               Object file (= relocatable binary)
OLB    LIBRARY Librarian    Binary object library
PS     TEX, DECwrite etc.   PostScript laser printer commands
TEX    TEX                  Text to be processed by TeX
DVI    TEX                  Device-independent output from TeX

If you conform to the standard extensions, compiling, linking and running could be as easy as:

$ FORTRAN TUT
$ LINK TUT
$ RUN TUT

$ cc program
$ link program
$ run program


3. ASSIGN command for substituting a logical name for physical name:
====================================================================

You can use a logical device name in a file specification instead of a physical device,
for example like

$ ASSIGN/NOLOG DEV$DISK:[HORACE.MCARLO] MONTE$DIR:

Files in the DEV$DISK:[HORACE.MCARLO] directory could then be referred to as MONTE$DIR:filename.type;v .
This is shorter, and has the advantage that if you move your files for any reason, you need only change
one logical name and everything will work in the new directory.

Any logical names which you ASSIGN or DEFINE are placed in a separate name table for your process only.
These will disappear once you log off, so frequently used logical names should be assigned in your
LOGIN.COM command file.

To see what logical names are defined for you enter

$ SHOW LOGICAL

or

$ SHOW LOGICAL SY*

to see only the logical names beginning with the letters "SY", for example.

More examples:

$ ASSIGN $DISK1:[CREMERS.MEMOS] MEMOSD

The ASSIGN command in this example equates the partial file specification $DISK1:[CREMERS.MEMOS]
to the logical name MEMOSD.

$ ASSIGN/USER_MODE $DISK1:[FODDY.MEMOS]WATER.TXT TM1

The ASSIGN command in this example equates the logical name TM1 to a file specification.
After the next image runs, the logical name is deassigned automatically.


4. MANIPULATING FILES:
======================

Directories:
------------

A directory is a special type of file that contains information about the other files contained within it.
The directory part of a file specification is delimited by [] brackets.
SET DEFAULT commands specify the "default directory", where files will be read from or written to,
unless the filename explicitly specifies a different directory.

Nomenclature

[]            The current directory.
[-]           One level up.
[-.-]         Two levels up.
[--]          Ditto.
[...]         Everything below the current level.
[.*]          All subdirectories one level down.
SYS$LOGIN     The user's login directory.
SYS$SCRATCH   The user's scratch directory (for large operations).

Actions

$ CREATE/DIR [.SUBDIR]                     Create a subdirectory.
$ SET DEFAULT [.SUBDIR]                    Move to this new subdirectory.
$ SET DEFAULT PRGDISK:[SHARED.PROGRAMS]    Move to this location.
$ SET DEFAULT [-]                          Move to one directory level up.
$ SET DEFAULT SYS$LOGIN                    Move to the user's home directory.

To delete a directory, first make all files in it deletable, then remove them:

$ SET FILE/PROT=O:RWED [.SUBDIR...]*.*;*
$ DELETE [.SUBDIR...]*.*;*

Issue the DELETE command repeatedly until no error messages appear (each pass empties the deepest
subdirectories, so that they can be deleted on the next pass). Then do:

$ SET FILE/PROT=O:RWED SUBDIR.DIR
$ DELETE SUBDIR.DIR;


DEL, DIR, PURGE, RENAME, SEARCH and file management:
----------------------------------------------------

DIR command:
============

In general, it is useful to look at all your files now and then, using the DIR command:

$ DIR                              List everything in the current directory.
$ DIR DISK:[DIR1.SUBDIR1]          List everything in the specified directory.
$ DIR/SIZE/OWNER/PROT FRED*.*      List all files beginning with "FRED" and show their size,
                                   who owns them, and what their protections are.
$ DIR/SIZ=ALL
  or DIR/SIZ=ALL filename
$ DIRECTORY/SINCE=TODAY/SIZE=ALL
$ HELP DIR                         More information on the DIRECTORY command.

DEL command:
============

You can delete a file by typing (after having run the above example)

$ DEL TUT.EXE;1

The ; is always necessary, but the version number may be omitted if you mean the highest version number.
This deletes the file, and tells you so if your LOGIN.COM file defaults are set correctly.

You can also use the confirm switch.

$ DELETE/CONFIRM PROTO.*;*
| USER$DISK:[FAXYZ]PROTO.DAT;2, DELETE? [N]:

PURGE command:
==============

A useful command is PURGE, e.g.

$ PURGE TUT.OBJ              (without version number or ; )

which will delete all versions of TUT.OBJ except the one with the highest version number.
A qualifier allows you to specify how many versions to keep, counting from the top, e.g.

$ PURGE/KEEP=3 name.type

will keep the three highest version numbers of the file specified.

RENAME command:
===============

Sometimes you want to RENAME a file, like in the following example:

$ RENAME VITALFILE.FOR;2 VITALFILE.BAC

As you may have guessed by now, the system never overwrites or replaces an existing file when you
write to it; a higher version is created instead. This is what makes the PURGE command so necessary
if you are to avoid using up all of your disk space allocation, or `disk quota'.

SEARCH command:
===============

Just like 'find' in DOS or 'grep' in UNIX, DCL lets you find strings in files with the SEARCH command.
A few examples will provide the general idea:

$ SEARCH *.* Fred
  means find Fred, FRED, fred, etc. in the latest version of any file

$ SEARCH/MATCH=EXACT *.*;* "Fred"
  means find only Fred in any version of any file

$ SEARCH *.* search_string
  means search all files for search_string

$ SEARCH *.FOR string
  will copy all lines containing "string" from all files of type .FOR to the screen,
  or onto an output file if you specify the /OUTPUT=filename option after SEARCH.


USE OF WILDCARDS:
-----------------

Just like in DOS or UNIX, you can use wildcards in file management or listings.
Use asterisks "*" and percent signs "%" in file names, where an asterisk stands for
"any alphanumeric string" and a "%" for "any single alphanumeric character".
A file specification containing these characters is called a wildcard specification.

Examples:

$ PURGE *.*

or just

$ PUR

will delete all but the highest version number of all your files
(it is good practice to do this from time to time).
The command

$ DIR *.COM

will list all your files of file type COM,

$ DIR MY*.%%A

will list all your files starting with `MY' and having a file type of three characters,
the last one being `A'.


Copy command and filetransfer:
------------------------------

When VAXes and Alphas are clustered, transfer is trivial. Certain disks on the different machines are
available to all members of the cluster, and they have names of the form "node_name$disk",
e.g. YR9$DKA300, YRL$DKA100, YRE$DKA400.
You can see which disks are available using the SHOW DEVICE D command.

Example of filetransfer:

To copy a file from user MORRISSEY's directory on DEV$DISK to user MARR's directory on PDS$DISK
without changing the filename you would enter

$ COPY DEV$DISK:[MORRISSEY]TCM.TEX PDS$DISK:[MARR]

There is usually no need for this sort of transfer, since you can access all the main disks from every
cluster member anyway, though you might want your own copy of a file to modify.

When VAXes or Alphas are not clustered, but are linked by DECnet as is often the case, file transfer
over DECnet is achieved by a simple copy command, which looks like this:

$ COPY node_name1::from_file node_name2::to_file

Note that the remote file specification must contain the DECnet node name in addition to the disk
and directory name. The node name is followed by ::.

Example:

To copy a file called COMMAND.COM from directory DISK$USERS:[TEST] on a VAX or Alpha called VMS1,
to your current default directory you would enter

$ COPY VMS1::DISK$USERS:[TEST]COMMAND.COM []

Usually the system managers of the machines you use will have set up what is known as a DECnet proxy entry
to allow you to copy easily, as in the above example. If this isn't the case, then you may have to specify
your remote username and password when you do the copy.

Imagine you were copying a local file called REPORT.TXT to a remote machine, DAKOTA, on which you had
an account with username FARGO and password UBETCHAYAH.
You place the username and password in quotes, between the node name and the :: .

$ COPY REPORT.TXT DAKOTA"FARGO UBETCHAYAH"::USER$DISK:[REPORTS]

If you were to leave out the USER$DISK:[REPORTS] part of the destination specification, then the file
would end up in FARGO's default login directory.
It's quite easy to lose track of where files are going during a COPY, so it's a good idea to add
the /LOG qualifier to the copy command so that you are told exactly where the file ended up !


File protection or permissions:
-------------------------------

Four forms of access to your files may be allowed or denied to four classes of users.

- The four forms of access are: Read access, Write access, Execute access and Delete access.
- The four classes of users are: System manager, Owner, Group members and World (everyone).

You may show the current protection of a file by entering a command like the following:

$ DIRECTORY/PROTECTION

to see something like:

Directory USER$DISK:[FAXYZ]
DRAFT.TXT;1 124 OCT-31-1988 (RE,RWED,RW,R)

The notation in parentheses lists the access for System, Owner, Group and World, in that order.
Here the system manager is given read and execute access to the file DRAFT.TXT, the owner retains
all four forms of access, members of the same group have read and write access, while "world" users
(anyone else) have only read access.

You may alter the access to the file with the command SET PROTECTION by specifying the access
and the filename:

$ SET PROTECTION=(S:RW,O:RWE,G:R,W:) DRAFT.TXT;1


5. VMS-UNIX command conversion chart:
=====================================

Sometimes it is helpful to compare some unix commands to VMS commands:

UNIX                               VMS
----                               ----
help                               HELP
man command                        HELP COMMAND
ls                                 DIR{ECTORY}
ls -ls                             DIR /OWNER /DATE /SIZE /PROT{ECTION}
ls ..                              DIR [-]
ls subdirectory                    DIR [.SUBDIRECTORY]
ls subdir1/subdir2                 DIR [.SUBDIR1.SUBDIR2]
mkdir subdir                       CREATE/DIR [.SUBDIR]
cd                                 SET DEFAULT SYS$LOGIN
cd subdir                          SET DEF [.SUBDIR]
cd ../subdir                       SET DEF [-.SUBDIR]
cp file1 file2                     COPY FILE1 FILE2
cp file subdir                     COPY FILE [.SUBDIR]
mv file1 file2                     RENAME FILE1 FILE2
rm file                            DELETE FILE;
rmdir subdir                       SET FILE/PROTECTION=(OWNER:RWED) SUBDIR.DIR
                                   DEL SUBDIR.DIR;
chmod ... filenm                   SET FILE/PROT=(...) FILENM
      u                              O{WNER}:
      g                              G{ROUP}:
      o                              W{ORLD}:
      r                              R
      w                              W
      x                              E
                                     D (DELETE)
chmod 755 filenm                   SET FILE/PROT=(O:RWED,G:RE,W:RE) FILENM
command > file                     command/OUTPUT=FILE
command < file                     command/INPUT=FILE
rlogin machine                     SET HOST MACHINE
script {scriptfile}                SET HOST MACHINE /LOG{=SCRIPTFILE}
  (default typescript)               (default SETHOST.LOG)
vi/edit file                       ADAM FILE      (EDT default editor)
                                   EDIT/TPU FILE  (programmable editor w/windows)
                                   LS{EDIT} FILE  (language sensitive editor)
cat file                           TYPE FILE
more file                          TYPE/PAGE FILE
cat file1 ... filen > newfile      COPY FILE1,...,FILEN NEWFILE
cat file1 ... filen >> newfile     APPEND FILE1,...,FILEN NEWFILE
lno file                           SEARCH/NUMBER FILE ""
lpr file                           HOTPRINT FILE -or- PRINT FILE
lpq                                SHOW QUEUE
grep string file                   SEARCH FILE STRING
sort file > outfile                SORT FILE OUTFILE
sort file                          SORT FILE SYS$OUTPUT
write user                         PHONE USER
mail user                          MAIL SEND
ps                                 SHOW SYSTEM
date                               SHOW TIME -or- SHO DATE
scriptfile                         @DCLSCRIPT.COM
. scriptfile
sh < scriptfile
source scriptfile
alias command 'string' (csh)       COMMAND :== STRING (see consultant)
alias                              SHOW SYMBOL/GLOBAL/ALL
.login -or- .profile               LOGIN.COM
stdin                              SYS$INPUT (VMS logicals)
stdout                             SYS$OUTPUT
stderr                             SYS$ERROR
whoami                             SHOW PROCESS
dc -or- bc                         CALC
^D (logout)                        LO{GOUT}
^D (EOF)                           ^Z
netcp (unix to unix)               NETCOPY (vms to vms)
transfer                           TRANSFER
nroff                              RUNOFF
passwd                             SET PASSWORD


6. DCL Commands, common ones in alphabetical order:
===================================================

Common commands:
----------------

Overview

This is a list of the commands most likely to be used by nonprivileged users.
Actions

$ _numeric == 20          Define a symbol that contains a numeric value.
$ _symbol :== a string    Define a symbol that contains a string.
$ append                  Append one or more files to one file.
$ assign                  Define a logical name.
$ attach                  Transfer control of terminal to a different process.
$ backup                  Make copies of files, directories, disks.
$ continue                After a {ctrl Y}, let program continue.
$ convert                 Change the format or contents of a file.
$ copy                    Copy a file or files.
$ create                  Create a file.
$ create/dir              Create a directory.
$ deassign                Cancel a logical name assignment.
$ define                  Define a logical name.
$ delete                  Delete a file, queue entry, or symbol.
$ differences             Compare two files, show the differences.
$ directory               List a directory's contents.
$ edit                    Edit a file. Many editors available.
$ ftp                     Transfer files to/from another computer.
$ help                    Get help on a topic.
$ mail                    Start the MAIL utility, send/read/print/delete mail.
$ merge                   Merge up to 10 presorted files into one.
$ monitor                 Check on disk, processor, etc. usage.
$ multinet ping           Check the route to another computer.
$ phone                   Interactive conversation with another user.
$ posix                   Enter the POSIX shell (like Unix).
$ print                   Print a file.
$ purge                   Delete lower numbered versions of a file.
$ rcp                     Copy files to/from another computer.
$ read                    Read information from the screen or a file.
$ recall                  Recall previous commands.
$ rename                  Change the name of a file or files.
$ rshell                  Execute commands on another computer.
$ run                     Run a program.
$ search                  Search file(s) for one or more strings.
$ set                     Set many things: terminal, queue entry, priority, etc.
$ show                    Show whatever SET can set.
$ sort                    Sort the contents of a file.
$ spawn                   Create a subprocess.
$ stop                    Stop a process or queue.
$ submit                  Start a batch job.
$ talk                    Interactive conversation with user on another computer.
$ telnet                  Interactive session on another computer.
$ type                    Type a file to the terminal.
$ write                   Send information to the screen or a file.

Commands related to devices:
----------------------------

Devices Overview

There are many types of devices available on OpenVMS systems, such as disks, tapes, terminals,
printers, and so forth. The operating system will assign each of them a name like "DKA100:"
(note the trailing colon).

One special device is NLA0:, which is the null device. Output directed there disappears - convenient
for disposing of status messages and such.

Actions

$ SHOW DEVICE [/FULL] [device_name]     Display information on one or more devices.

Disk and tape commands, usually issued by privileged users.

$ INIT device_name volume_label         Initialize the device.
$ MOUNT device_name volume_label        Mount the volume in the device on the system.
$ DISMOUNT device_name                  Dismount the volume in the device.

Terminal commands, typically anybody can use these.

$ SHOW TERM                             Show terminal characteristics.
$ SET TERM [/WIDTH=132] [/PAGE=50] [/SPEED=9600]
                                        Change terminal characteristics.


7. COMMAND FILES:
=================

Just as in UNIX or DOS, you can create scripts, or batch files, that run a series of statements.
Command files are files containing a series of DCL commands, just as you would enter them from a terminal.
They can either be executed interactively, or submitted for batch execution - see the later section
on batch jobs.

A command file which you have executed already, perhaps without realising it, is your LOGIN.COM .
This is executed automatically every time you log in, although you can stop it from being executed
(if you have made some sort of mistake in it that causes a loop, say) by adding /NOCOM after your
user name when you log in.

The default file type for command files is .COM , so if you just type @MYCOMMANDS then VMS will assume
that you mean MYCOMMANDS.COM.

Each command must be preceded by a $ sign; lines without this are interpreted as input to procedures
called from the command file, and are otherwise skipped, with an error message.
Continuation lines are indicated by a "-" (hyphen = minus sign) at the end of the line; the command
simply continues on the next line, e.g.

$ SHOW -
SYSTEM     ! This would be a silly place to split a line, but you get the idea

Example:

Try this ! Create the file TESTCOM.COM and try to predict its action.
The exclamation mark is a comment character in DCL, like * in FORTRAN, // in C++.

$ CREATE TESTCOM.COM     ! Whatever you type now goes into the file TESTCOM.COM
$ WRITE SYS$OUTPUT "Hello World" ! Ever original
$ EXIT
'Ctrl-Z'

The 'Ctrl-Z' terminates the input to the CREATE command. If you then enter

$ TYPE TESTCOM.COM

the output should look like this:

$ WRITE SYS$OUTPUT "Hello World" ! Ever original
$ EXIT

To execute the file, you have to enter

$ @TESTCOM


8. DCL SYMBOLS OR VARIABLES:
============================

Symbols are useful for defining shorthand for frequently used commands, and for use as "variables"
in DCL command procedures.

Using the single "=" sign you can define symbols which are local to a command file, i.e. they disappear
at exit from it, or you can define global symbols which remain valid until you log off, using "==".

Examples:

$ three = 3
$ file := SYS$EXAMPLES:TUT.FOR

These can be used inside the command file, e.g. for setting up the files for a batch job.

Please note that "=" assigns the value of an expression, while ":=" assigns a string to a symbol.
Assigning a string with "=" is also acceptable, as long as the string is placed in double quotes ".

$ three == 3
$ file :== SYS$EXAMPLES:TUT.FOR
$ string1 :==The whole of this line will end up in STRING1
$ string2 == "This is a string too"

All these symbols will however remain valid even after the execution of the command file,
because the "==" was used.

DCL is case-insensitive for the most part, so it doesn't matter whether your symbols are uppercase
or lowercase. Having said that I tend to use lowercase for my own symbols, and uppercase for
built in DCL commands, just to make it easier to read and tell them apart.

To invoke a symbol, put it between single quotes, e.g. 'file', as in

$ file := MYDATA.DAT     ! Local symbol
$ COPY 'file' FARVAX::SCRATCH$DEVICE:

Note that it is the right-hand single quote ' (not the backquote) both before and after the symbol.
If you use the symbol within a quoted string, you need two quotes before it and one after, like this:

$ file :== TESTCOM.COM     ! Global symbol
$ WRITE SYS$OUTPUT "Copying ''file' to the remote system."     ! Two '
$ COPY 'file' FARVAX::SCRATCH$DEVICE:     ! One '

Since symbols can be defined directly, without command files, try the above definition of file
followed by the command

$ TYPE 'file'

and you will understand.

Symbols are often used to provide a shorthand way of specifying a frequently used command with
several qualifiers. For example, instead of having to type

$ DIRECTORY/SINCE=TODAY/SIZE=ALL     ! Get all files created today, show their size

you could define a symbol in your LOGIN.COM like this:

$! Get all files created today, show their size
$ SDIR:==DIRECTORY/SINCE=TODAY/SIZE=ALL

then you need only type

$ SDIR

to get the same information.

It is considered bad practice to define symbols that clash with built-in DCL commands,
because it can lead to all sorts of confusion regarding the expected behaviour of commands.

To see what symbols you already have defined you can type

$ SHOW SYMBOL *

This assumes that someone hasn't defined a symbol called SHOW to do something else !
If you suspect that they have, you can get rid of symbols by typing

$ DELETE/SYMBOL symbol_name

You could guarantee that DELETE would give you the DCL DELETE functionality by doing

$ DELETE:=DELETE

and indeed you will occasionally see this done in command files, to insulate them from the effects
of any users who have been foolish enough to define symbols that clash with DCL commands.


9. Processes:
=============

SHOW PROCESSES:
---------------

$ SHOW SYSTEM
$ SHOW PROCESS/ALL
$ SHOW PROCESS /id=process_id
$ SHOW PROCESS process_name

SHOW SYSTEM lists the processes on the system; the SHOW PROCESS forms give information about your
own process, or about the process you specify.
Normal user priority is 4, but certain system tasks have higher priority, and user batch jobs always have lower priority (1, 2 or 3 for long, medium or normal batch jobs) so that they use up spare CPU time with very little inconvenience to interactive users. Normal users can only set their priority, or that of their batch jobs, up to the base limit of 4. VMS also manages the batch job queues, allows the different VAX and Alphas in the cluster to talk to each other, and many other tasks of this nature. You can tell which processes are hogging which resources using variants of the MONITOR command: $ MONITOR process/topcpu Who's using all the CPU? $ MONITOR process/topfault Who's page faulting so much? $ MONITOR disk What's going on on the disks? CREATE A SUBPROCESS: -------------------- Use the SPAWN command. Here is an example of interrupting a program, creating a subprocess, doing some stuff in it interactively, and then returning to the program running in the main process: $ run myprog ^Y $ spawn $ dir *.dat Do a couple of commands, this is just an example $ logout $ continue The program completes normally. Note that giving a command other than spawn or attach would have killed the halted program "myprog". You can also use Spawn to get a subprocess running at the same time as the main process. For instance, the following will start the program XV (an interactive graphics program for DECwindows) and then let you continue with the current session: $ spawn/nowait xv $ Note that a ^Y or ^C at the top session will kill the subprocess. STOP A PROCESS: --------------- If you know the name or process ID, and it belongs to you, or you have sufficient privileges: $ stop process_name or $ stop/id=process_number a typical number: 20200242 You can get the process_name or _number from: $ show system If the process you want to stop is your current session, or the program you are running, use: {^Y} Control key and Y, stop the current program. 
$ logout


BATCHJOBS:
----------

How do I start a batch job?
---------------------------

First you need to put a series of DCL commands into a file, because batch jobs require
DCL procedure files to tell them what to do. (They aren't interactive, so you can't do so
from your terminal.) Here is a simple procedure file that sorts a couple of files and then
merges them. Generally, you would use an editor to create this file.

-- just an example 'test.com' file:

$! first line of "TEST.COM", note no error checking!!!
$ sort file1.txt file1.txt_sorted
$ sort file2.txt file2.txt_sorted
$ sort file3.txt file3.txt_sorted
$ merge file1.txt_sorted,file2.txt_sorted,file3.txt_sorted file4.txt
$ delete file%.txt_sorted;*
$ write sys$output "All done"
$!last line of file

This is one command you might use to start it on a batch queue:

$ SUBMIT/NOTIFY/NOPRINT/LOG=SYS$LOGIN:/QUEUE=queue_name test.com

This says:

  Put it on the batch queue named "queue_name"
  Notify my terminal when it finishes (only works if you are still logged in!)
  Keep a log file, it will be SYS$LOGIN:TEST.LOG
  Don't print the log file

It will tell you the entry number when it is placed on the queue.

How do I stop a batch job?
--------------------------

First, figure out the entry number, if you didn't write it down when you issued the
SUBMIT command to place it on the queue.

$ SHOW ENTRY      Show all entries that you own in any queue.

Figure out which one is yours. Then do:

$ DELETE/ENTRY=entry_number


10.
BOOTPROCEDURE OpenVMS: ========================== Generic description of the bootprocedure: ----------------------------------------- Together, the booting and startup processes comprise the following steps: BOOT command | loads primary bootstrap On VAX, this is VMB.EXE On Alpha this is APB.EXE | loads secondary bootstrap: SYS$SYSTEM:SYSBOOT.EXE | SYSBOOT.EXE loads parameters from default parameter file | loads Executive | loads SWAPPER | loads SYSINIT | starts STARTUP process and executes STARTUP.COM | This will also execute SYSTARTUP_VMS.COM You enter the BOOT command. The boot block, a fixed location on disk, points to the primary bootstrap image, which is loaded from disk into main memory. - On VAX systems, the primary bootstrap image is VMB.EXE. - On Alpha systems, the primary bootstrap image is APB.EXE. The primary bootstrap image allows access to the system disk by finding the secondary bootstrap image, SYS$SYSTEM:SYSBOOT.EXE, and loading it into memory. SYSBOOT.EXE loads the system parameters stored in the default parameter file into memory. If you are performing a conversational boot, the procedure stops and displays the SYSBOOT> prompt. Otherwise, SYSBOOT.EXE loads the operating system executive into memory and transfers control to the executive. When the executive finishes, it executes the SWAPPER process. The SWAPPER creates the SYSINIT process. Among other actions it performs, SYSINIT creates the STARTUP process. STARTUP executes SYS$SYSTEM:STARTUP.COM (unless you indicated another file using SYSMAN, SYSGEN, or conversational boot). STARTUP.COM executes a series of other startup command procedures, including SYSTARTUP_VMS.COM. The current values of system parameters are written back to the default parameter file. The boot process finishes, and you can log in to the operating system. To start an Oracle database automatically when the VMS system is rebooted, place the following commands in a command procedure, e.g. DUA0:[ORACLE7]START_SALES.COM. 
$ @ORA_ROOT:[DB_SALES]ORAUSER_SALES $ INSORACLE $ @ORA_ROOT:[DB_SALES]STARTUP_EXCLUSIVE_SALES Then edit the system startup file SYS$MANAGER:SYSTARTUP_VMS.COM and add the following command at the end of the file: $ SUBMIT/USER=ORACLE7 DUA0:[ORACLE7]START_SALES 11. ORACLE STUFF ON VMS: ======================== Paragraphs 11.1 & 11.2 deals with the support of Oracle Releases on VAX OpenVMS and Alpha OpenVMS. In short, these are the conclusions: VAX/OpenVMS: max Oracle Server 7.3.x Alpha/OpenVMS: Oracle 7,8,8i,9i 11.1 Supported releases on HP VAX VMS: =============================================================================== Note:52574.1 Subject: HP VAX OpenVMS Certification Matrix Type: FAQ Status: PUBLISHED ORACLE Server ------------- HP VAX OPENVMS -------------- CERTIFICATION MATRIX -------------------- This article lists the historic certification matrix for HP VAX OpenVMS. Version support - Certification Matrix ====================================== As of 1st January 2001, the Oracle Server is no longer fully supported on HP VAX OpenVMS. The upgrade path is to any other supported platform (ie HP ALPHA OpenVMS) via full export/import Oracle Server release 7.3.2.3.1 is the terminal release on VAX. It will remain under Extended Assistance Support until 1st January 2004 (ie for a period of 3 years) Support Matrix for HP VAX OpenVMS versions and Oracle Server releases. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Last Update: 30-AUG-2002
Updated by: Grant Hayden

                              VAX/OPENVMS
--------------------------------------------------------------------------
 5.5        | 6.0        | 6.1        | 6.2        | 7.0        | 7.1
--------------------------------------------------------------------------
 6.0.37.6   | 6.0.37.6   | 6.0.37.6   |            |            |
 7.0.13.1   | 7.0.13.1   | 7.0.13.1   |            |            |
 7.0.15.4   | 7.0.15.4   | 7.0.15.4   |            |            |
 7.0.16.6.0 | 7.0.16.6.0 | 7.0.16.6.0 |            |            |
 7.0.16.6.2 | 7.0.16.6.2 | 7.0.16.6.2 |            |            |
 7.1.3.2    | 7.1.3.2    | 7.1.3.2    | 7.1.3.2    |            |
 7.1.3.4    | 7.1.3.4    | 7.1.3.4    | 7.1.3.4    |            |
 7.1.5.2.4  | 7.1.5.2.4  | 7.1.5.2.4  | 7.1.5.2.4  |            |
            |            |            |            | 7.3.2.3.1  | 7.3.2.3.1
==========================================================================
(For ALPHA OpenVMS Certification Matrix see [NOTE:62150.1] )

NOTES:

A. Oracle versions prior to 7.3 are *NOT* supported on OpenVMS 7.x

B. Oracle 7.1.5.2.4 will be the LAST release of Oracle which supports V5.5 of VAX/VMS
   due to the move from VAX 'C' to DEC 'C'.

C. Oracle 8 will not be shipped to HP VAX hardware. Hence Oracle Server release 7.3
   will be the terminal release of Oracle on the VAX port.

D. Oracle 7.3.2.3.1 is the terminal Oracle release on VAX OpenVMS.
   It will remain fully supported until 01-JAN-2001.


11.2 Supported releases on Alpha Open VMS:
===============================================================================

Note:62150.1
Subject: HP Alpha OpenVMS Certification Matrix
Type: FAQ
Status: PUBLISHED

ORACLE Server
-------------
HP ALPHA OPENVMS
----------------
CERTIFICATION MATRIX
--------------------

This article lists the current certification matrix for HP ALPHA OpenVMS.

In Metalink, this note is best viewed using the 'default font'. To change to the
'default font', click on the 'fixed font' text at the top of the screen.
Version support - Certification Matrix ====================================== Support Matrix for HP ALPHA OpenVMS versions and Oracle Server releases ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Last Update: 03-JAN-2003 Updated by: Grant Hayden ALPHA/OPENVMS --------------------------------------------------------------------------- | | | | | 6.2 | 7.0 | 7.1 | 7.2 | 7.3 | | | | |** see note J | --------------------------------------------------------------------------- 7.1.5.2.3 | | | | | | 7.3.2.2.0 | | | | | 7.3.2.3.0 | 7.3.2.3.0 | | | | | 7.3.2.3.2 | | | | | 7.3.3.4 | | | | | 7.3.3.6 | 7.3.3.6 | | | | 7.3.4.3 | 7.3.4.3 | | | | 7.3.4.4 | 7.3.4.4 | | --------------------------------------------------------------------------- | | 8.0.3.2.0 | | | | | 8.0.5.0.0 | 8.0.5.0.0 | | | | 8.0.5.0.1 | 8.0.5.0.1 | | | | 8.0.5.1.0 | 8.0.5.1.0 | | | | | 8.1.6.0.0 | | ============================================ ** see note E ================ | | | 8.1.7.0.0 | 8.1.7.0.0 | | | | ** see note E | | | | 8.1.7.1b | 8.1.7.1b | | | | 8.1.7.3.0 | 8.1.7.3.0 | | | | 8.1.7.4.0 | 8.1.7.4.0 | | | | ** see note D | | | | 9.0.1.0.0 | | | | |** see note C | | | | | 9.0.1.3.0 | 9.0.1.3.0 | | | |** see note B | | | | | | 9.2.0.2.0 | | | | |** see Note A | =========================================================================== (For VAX OpenVMS Certification matrix - see [NOTE:52574.1] ) A. Oracle 9 Server Release 9.2.0.2 was announced on 21-DEC-02 Additional information on this release is available under [NOTE:222553.1] FAQ for Oracle RDBMS Release 9.2.0.2.0 for Alpha OpenVMS B. Patch set 9.0.1.3.0 is now available from the Metalink patch download area. Use patch number 2271678 and platform 'HP Alpha OpenVMS' to locate the download. The patch set, like all recent one-off patches, is supplied as a zip file. Please FTP the downloaded file in BINARY mode to your VMS system. 
The zip file should not be expanded locally on your PC as the subsequent FTP transfer to VMS of the expanded file set will corrupt some of the supplied files and hence make it impossible to apply the patch set correctly. If you have not got the UNZIP utility on VMS, it can be obtained from one of the following locations :- ftp://oracle-ftp.oracle.com/server/patchsets/midrange/alpha/zip/ http://www.info-zip.org/pub/infozip/Zip.html#VMS http://www.openvms.compaq.com/ Then search for UNZIP *IMPORTANT NOTE* - Please review the PATCH_NOTE.HTM file provided with this patch set prior to installation. 9i RAC is only certified against OpenVMS 7.2-1H1 and above and patch 2267002 is required for this certification. This patch is available from the Metalink download area. Use patch number 2267002 and platform 'HP Alpha OpenVMS' to locate the download. C. Oracle 9 Server Release 9.0.1.0.0 is now orderable on OpenVMS. The Server CD kit is shipping under part number A91377-01 This release includes RAC (Real Application Clusters) which is the replacement for the Oracle Parallel Server product under Oracle 8 and earlier releases. Note that this release is only certified against OpenVMS 7.2-1 and 7.2-1H1. Please apply the 9.0.1.3.0 patch set for certification against OpenVMS 7.2-2 and 7.3. (See note B) D. Patch set 8.1.7.4.0 is now available from the Metalink patch download area. Use patch number 2376472, platform 'HP Alpha OpenVMS' to locate the download. The patch set, like all recent one-off patches, is supplied as a zip file. Please FTP the downloaded file in BINARY mode to your VMS system. The zip file should not be expanded locally on your PC as the subsequent FTP transfer to VMS of the expanded file set will corrupt some of the supplied files and hence make it impossible to apply the patch set correctly. 
If you have not got the UNZIP utility on VMS, it can be obtained from one of the following locations :- ftp://oracle-ftp.oracle.com/server/patchsets/midrange/alpha/zip/ http://www.info-zip.org/pub/infozip/Zip.html#VMS http://www.openvms.compaq.com/ Then search for UNZIP *IMPORTANT NOTE* - Please review the PATCH_NOTE.HTM file provided with this patch set prior to installation. For information: Patch set 8.1.7.3 is still available from the Metalink patch download area. Use patch number 2189751, platform 'HP Alpha OpenVMS' to locate the 8173 download. Patch set 8.1.7.1(b)is still available from the Metalink patch download area. Use patch number 1746764, platform 'HP Alpha OpenVMS' to locate the 8171b download. E. Oracle 8 Server Release 8.1.7.0.0 is now orderable. Note that a minimum of OpenVMS 7.2-1 is required for this release. Current part number A87888-02 includes TG4RDB Previous part number A87888-01 does not include TG4RDB This is the terminal Oracle 8i release. F. Oracle Server releases prior to Oracle Server release 8.1.7 are no longer fully supported. Oracle Server releases 7.3.4, 8.0.5 and 8.1.6 are currently under the Extended Assistance Support (EAS) program. See [NOTE:66697.1] for a definition of this program. EAS ends on 31-DEC-2003 for Oracle Release 7.3.4 ([NOTE:66409.1] ) EAS ends on 30-JUN-2003 for Oracle Release 8.0.5 ([NOTE:72533.1] ) EAS ends on 31-OCT-2004 for Oracle Release 8.1.6 ([NOTE:123178.1] ) Oracle Server releases prior to 8.1.7.0 are not certified to run against OpenVMS 7.3 G. Patch sets and one-off patches are available for download from Metalink. Some historic patches are also available from the following URL. ftp://oracle-ftp.oracle.com/server/patchsets/midrange/alpha/ Please note that patch sets are cumulative and can be applied, unless otherwise stated in the patch set documentation, directly against the Oracle base Release or any intervening patch set version. H. 
Please note Oracle Server Release 7.3.3.4 and beyond only supports Developer 2000 version 1.6.1. Developer 2000 version 1.3.2 which is available with server release 7.3.2.3.2 can be used against Oracle 7.3.3.x but only when installed under a separate code tree.(ie a different ORA_ROOT). Similarly, Developer 2000 1.6.1 can be used against Oracle 7.3.4.x and Oracle 8 but only when installed under a separate code tree. I. ALPHA OpenVMS Desupport notice for EV5 or earlier systems Please review [NOTE:181307.1] for information on what the minimum hardware requirements will be for running Oracle9i Release 2 J. Oracle releases 8.1.7 and 9.0.1 are now certified against OpenVMS 7.3-1 **Warning** VMS 7.3 Extended File Cache (XFC) may cause data corruption. There is a bug in the VMS 7.3 Extended File Cache (XFC) software that may cause data corruption. XFC is controlled by the sysgen parameter vcc_flags. By default, XFC is enabled in VMS 7.3. The workaround is to set vcc_flags to 1. The OpenVMS ECO VMS73_XFC-V0200 (or later) should be applied to resolve this issue. Contact HP for more information. 11.3 Notes on Oracle Server 7 on OpenVMS: =============================================================================== - LOGICALS: Instead of UNIX or Windows 'ORACLE_SID' environment variable, VMS uses a logical name and the equivalent of the ORACLE_SID is ORA_SID - ORA_ROOT: When Oracle is installed a root directory is chosen which is pointed to by the logical name ORA_ROOT. This directory can be placed anywhere on the VMS system. The majority of code, configuration files and command procedures are found below this root directory. When a new database is created a new directory is created in the root directory to store database specific configuration files. This directory is called [.DB_dbname]. This directory will normally hold the system tablespace data file as well as the database specific startup, shutdown and orauser files. 
- SYSTEM TABLESPACE:

  The SYSTEM tablespace will be installed in the ORA_ROOT:[DB_] directory.

- USERS ENVIRONMENT:

  The Oracle environment for a VMS user is set up by running the appropriate
  ORAUSER_dbname.COM file. This sets up the necessary command symbols and logical names
  to access the various ORACLE utilities.

  Each database created on a VMS system has an ORAUSER file in its home directory,
  named ORAUSER_dbname.COM. For a database SALES, for example, the file specification
  could be:

  ORA_ROOT:[DB_SALES]ORAUSER_SALES.COM

  To have the environment set up automatically on login, run this command file from your
  login.com file. A user then has easy access to, for example, SQLPLUS, using the
  following command:

  $ SQLPLUS username/password

- END A USER SESSION:

  You can forcefully end a user session in Oracle in one of two ways:

  ALTER SYSTEM KILL SESSION from within an Oracle tool
  or
  $STOP/ID=

- STARTING AND STOPPING A DATABASE:

  There are several methods available for database startup and shutdown.
  ORACLEINS (the Oracle install program) and SQLDBA both have menu driven methods to
  start or stop a database. Alternatively, use command files.

  The following commands will start a database called SALES (the command INSORACLE
  installs various shared images which improve Oracle performance):

  $ @ORA_ROOT:[DB_SALES]ORAUSER_SALES
  $ INSORACLE
  $ @ORA_ROOT:[DB_SALES]STARTUP_EXCLUSIVE_SALES

  To start this database automatically when the VMS system is rebooted, place these
  commands in a command procedure, e.g. DUA0:[ORACLE7]START_SALES.COM.
  Then edit the system startup file SYS$MANAGER:SYSTARTUP_VMS.COM and add the following
  command at the end of the file:

  $ SUBMIT/USER=ORACLE7 DUA0:[ORACLE7]START_SALES

  This will start a batch job running under the Oracle7 user account, which will start up
  the database instance SALES.

  A database can be shut down by running the command procedure SHUTDOWN_dbname.COM,
  which is found in the database's home directory.
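
The naming pattern used in 11.3 can be summarized in a few lines. The Python sketch below is not part of Oracle or VMS; the function name oracle_db_files is made up for illustration, and it simply assumes the ORA_ROOT:[DB_dbname] layout and the ORAUSER_, STARTUP_EXCLUSIVE_ and SHUTDOWN_ file names as they appear in the text above.

```python
def oracle_db_files(dbname, startup_mode="EXCLUSIVE"):
    """Return the per-database command procedures described in 11.3.

    Assumes the layout from the text: ORA_ROOT:[DB_<dbname>] holds
    ORAUSER_<dbname>.COM, STARTUP_<mode>_<dbname>.COM and SHUTDOWN_<dbname>.COM.
    """
    name = dbname.upper()
    home = f"ORA_ROOT:[DB_{name}]"
    return {
        "home": home,
        "orauser": f"{home}ORAUSER_{name}.COM",
        "startup": f"{home}STARTUP_{startup_mode}_{name}.COM",
        "shutdown": f"{home}SHUTDOWN_{name}.COM",
    }


files = oracle_db_files("sales")
for key in ("orauser", "startup", "shutdown"):
    print(files[key])
```

For a database called SALES this yields the same ORAUSER and STARTUP file specifications as used in the startup commands above, with the default .COM file type written out.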
11.4 Global overview installation Oracle 9.2.x on Alpha OpenVMS: =============================================================================== 1. Check memory first: $ SHOW MEMORY $ SHOW MEMORY/RESERVED 2. Check the following: Do you have: 3 GB free diskspace HP OpenVMS 7.3 TCPIP UCX X-windows, needed for running the OUI Check OS version with: $ SHOW SYSTEM/NOPROCESS/FULL Check X-Windows with for example $ RUN SYS$SYSTEM:DECW$CLOCK 3. Check the filesystem: The disk containing the Oracle code tree must use ODS-2 (data) or ODS-5 (software). The logicals ORA_ROOT, ORAROOT_DIR, ORACLE_HOME will point to locations on this disk. Check with: $ SHOW DEVICE/FULL Change structure of disk example: $ SET VOLUME/STRUCTURE_LEVEL=5 $2$DCK100: Format disk example: $ INITIALIZE/STRUCTURE=5 $2$DCK100: TESTVOL 4. Create the Oracle OpenVMS account: $ SET DEFAULT SYS$SYSTEM $ RUN AUTHORIZE UAF>ADD Oracle9 /PASSWORD=ORACLE/UIC=[277,100] - /DEVICE=/DIRECTORY=[Oracle9i]/OWNER="ORACLE DBA" 5. Privileges: A number of privileges needs to be granted to Oracle9 UAF>MODIFY Oracle9 - /PRIVILEGE=(.,.,.,..) -- see manual Install Oracle 9.2.0.2 on OpenVMS: ===================================== Simple example of using the OUI to install Oracle9i Release 2 on an OpenVMS System: =================================================================================== We have a PC running Xcursion and a 16 Processor GS1280 with the 2 built-in disks In the examples we booted on disk DKA0: Oracle account is on disk DKA100. Oracle and the database will be installed on DKA100. Install disk MUST be ODS-5. Installation uses the 9.2 downloaded from the Oracle website. It comes in a Java JAR file. Oracle ships a JRE with its product. However, you will have to install Java on OpenVMS so you can unpack the 9.2 JAR file that comes from the Oracle website Unpack the JAR file as described on the Oracle website. This will create two .BCK files. 
Follow the instructions in the VMS_9202_README.txt file on how to restore the 2 backup save sets. When the two backup save sets files are restored, you should end up with two directories: [disk1] directory [disk2] directory These directories will be in the root of a disk. In this example they are in the root of DKA100. The OUI requires X-Windows. If the Alpha system you are using does not have a graphic head, use a PC with an X-Windows terminal such as Xcursion. During this install we discovered a problem: Instructions tell you to run @DKA100:[disk1]runinstaller. This will not work because the RUNINSTALLER.COM file is not in the root of DKA100:[disk1]. You must first copy RUNINSTALLER.COM from the dka100:[disk1.000000] directory into dka100:[disk1]: $ Copy dka100:[disk1.000000]runinstaller.com dka100:[disk1] From a terminal window execute: @DKA100:[disk1]runinstaller - Oracle Installer starts Start the installation Click Next to start the installation. - Assign name and directory structure for the Oracle Home ORACLE_HOME Assign a name for your Oracle home. Assign the directory structure for the home, for example Ora_home Dka100:[oracle.oracle9] This is where the OUI will install Oracle. The OUI will create the directories as necessary - Select product to install Select Database. Click Next. - Select type of installation Select Enterprise Edition (or Standard Edition or Custom). Click Next. - Enable RAC Select No. Click Next. - Database summary View list of products that will be installed. Click Install. - Installation begins Installation takes from 45 minutes to an hour. Installation ends Click Exit. Oracle is now installed in DKA100:[oracle.oracle9]. To create the first database, you must first set up Oracle logicals. To do this use a terminal and execute @[.oracle9]orauser . The tool to create and manage databases is DBCA. On the terminal, type DBCA to launch the Database Assistant. Welcome to Database Configuration Assistant DBCA starts. Click Next. 
Select an operation Select Create a Database. Click Next. Select a template Select New Database. Click Next. Enter database name and SID Enter the name of the database and Oracle System Identifier (SID): In this example, the database name is DB9I. The SID is DB9I1. Click Next. Select database features Select which demo databases are installed. In the example, we selected all possible databases. Click Next. Select default node Select the node in which you want your database to operate by default. In the example, we selected Shared Server Mode. Click Next. Select memory In the example, we selected the default. Click Next. Specify database storage parameters Select the device and directory. Use the UNIX device syntax I.E. For example, DKA100:[oracle.oracle9.database] would be: /DKA100/oracle/oracle9/database/ In the example, we kept the default settings. Click Next. Select database creation options Creating a template saves time when creating a database. Click Finish. Create a template Click OK. Creating and starting Oracle Instance The database builds. If it completes successfully, click Exit. If it does not complete successfully, build it again. Running the database Enter “show system” to see the Oracle database up and running. Set up some files to start and stop the database. Example of a start file This command sets the logicals to manage the database: $ @dka100:[oracle.oracle9]orauser db9i1 The next line starts the Listener (needed for client connects). The final lines start the database. Stop database example Example of how to stop the database. Test database server Use the Enterprise Manager console to test the database server. Oracle Enterprise Manager Enter address of server and SID. Name the server. Click OK. Databases connect information Select database. Enter system account and password. Change connection box to “AS SYSDBA.” Click OK. Open database Database is opened and exposed. Listener Listener automatically picks up the SID from the database. 
Start Listener before database and the SID will display in the Listener.
If you start the database before the Listener, the SID may not appear immediately.
To see if the SID is registered in the Listener, enter:

$lsnrctl stat

Alter a user
User is altered:

SQL> alter user oe identified by oe account unlock;
SQL> exit

Preferred method is to use the Enterprise Manager Console.


12. OpenVMS File systems and Diskstructures:
============================================

On-Disk Structure (ODS) refers to a logical structure given to information stored on a
disk or CD-ROM. It is a hierarchical organization of files, their data, and the
directories needed to gain access to them. The OpenVMS file system implements the
On-Disk Structure and provides access control to the files located on the disk.

OpenVMS File Structure Options

On-Disk Structures include Levels 1, 2, and 5. (Levels 3 and 4 are internal names for
ISO 9660 and High Sierra CD formats.) ODS-1 and ODS-2 structures have been available on
OpenVMS systems for some time. With OpenVMS Version 7.2 on Alpha systems, you can now
specify ODS-5 to format disks as well.

ODS-1   Both   VAX only; use for RSX compatibility: RSX--11M, RSX--11D,
               RSX--11M--PLUS, and IAS operating systems.

ODS-2   Both   Use to share data between VAX and Alpha with full compatibility;
               default disk structure of the OpenVMS operating system.

ODS-5   Both   Superset of ODS-2; use on Alpha systems when working with systems
               like NT that need expanded character sets or deeper directories
               than ODS-2.
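
To make the "expanded character sets" remark concrete, here is a rough Python sketch that tests whether a file name would fit the traditional ODS-2 rules. The limits in it are assumptions that should be verified against the OpenVMS documentation (name and type of at most 39 characters each, drawn from A-Z, 0-9, "$", "_" and "-", with a single dot); the function name fits_ods2 is made up for illustration.

```python
import re

# Assumed ODS-2 limits (verify against the OpenVMS documentation):
# name and type up to 39 characters each, drawn from A-Z, 0-9, $, _ and -.
_ODS2_PART = re.compile(r"^[A-Z0-9$_-]{0,39}$")


def fits_ods2(filename):
    """Return True if 'name.type' looks storable on an ODS-2 volume."""
    # ODS-2 is treated as uppercase-only here, so compare the uppercased form.
    name, _dot, ftype = filename.upper().partition(".")
    if "." in ftype:
        # More than one dot in the name is taken to need ODS-5 in this sketch.
        return False
    return bool(_ODS2_PART.match(name)) and bool(_ODS2_PART.match(ftype))


for candidate in ("LOGIN.COM", "my report.final.txt"):
    print(candidate, fits_ods2(candidate))
```

Names that fail such a check (spaces, several dots, long names) are the kind of NT-style file names for which ODS-5 is intended.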
############################################################################################# ############################################################################################# ############################################################################################# ======================================================== Section 12: NT/200x/XP CMD shell script examples: ======================================================== ############################################################################# ############################################################################# Part 1: Traditional old cmd/dos batch command examples DOS/Win9x/NT/200x/XP/Vista ############################################################################# ############################################################################# 1. Put day, month, year into variables: ======================================= @echo off for /f "tokens=2-4 delims=/ " %%a in ('date /t') do ( set mm=%%a set dd=%%b set yyyy=%%c) REM to show these variables echo %mm% echo %dd% echo %yyyy% Or put in a logfile: echo ============== >> c:\temp\report.log echo START RUNTIME: >> c:\temp\report.log echo ============== >> c:\temp\report.log date /T >> c:\temp\report.log time /T >> c:\temp\report.log 2. Some copy and xcopy command examples: ======================================== -- If you want to xcopy files from a certain date: xcopy *.* /D:01-13-2002 f:\backup xcopy *.* /D:%datum% f:\backup -- Some examples of copy commands using variables: copy %NTResKit%\perfmib.dll %systemroot%\system32\perfmib.dll copy %NTResKit%\perfmib.ini %systemroot%\system32\perfmib.ini If you want to use xcopy for backup purposes in Win2Kx / Vista / XP, please see Part 6. 3. 
The use of "FOR" example:
============================

Example: print all .doc and .txt files in 1 command
---------------------------------------------------

for %f in (*.doc *.txt) do type %f > prn

In a batch file, just use: %%f

Example: register some dll's in 1 command
------------------------------------------

for %f in (*.dll) do regsvr32 %f

Example: copy text into a file a number of times
------------------------------------------------

for /L %%f in (1,1,1000) do echo Albert >> c:\test\test.txt

(1,1,1000) means (start,step,end)

Example:
--------

Or look at this example:

FOR /L %variable IN (start,step,end) DO command [command-parameters]

To see this in action, at a command prompt, type

FOR /L %i in (1,1,5) do @echo %i

and you should see:

1
2
3
4
5

Example: sort of unix cut functionality with the use of for:
------------------------------------------------------------

Suppose you have the following file "myfile.txt":

a,b,c
d,e,f
g,h,i

FOR /F "tokens=2,3* delims=, " %i in (myfile.txt) do @echo %i %j >> myfile2.txt

will create the following file "myfile2.txt" (the second and third field of each line,
separated by the space used in the echo command):

b c
e f
h i

Example:

@ECHO OFF
IF (%1)==() FOR %%v in (GOTO:END ECHO.(%%1):(%1)) do %%v
ECHO Got a value
:END
ECHO The end


4. "If..
Then..Else" test and the use of Labels:
================================================

Example 1:
----------

@echo off
setlocal
if (%2)==() goto usage
sqlplus %1/%2 @%ORACLE_HOME%\sqlplus\demo\demobld.sql
goto exit
:usage
echo Usage: demobld userid passwd
:exit
endlocal

Example 2:
----------

Two separate fragments; each one is a batch file of its own.

@echo off
set test=q
if %test%==%1 goto lab2
:lab1
echo not_equal
goto end
:lab2
echo equal
goto end
:end

if exist c:\temp goto lab2
:lab1
echo does not exist
goto end
:lab2
echo does exist
goto end
:end

Example 3:
----------

Some loose statements:

if '%1' == '' goto ERR0

if not exist %SYSTEMROOT%\SYSTEM32\SQRDB3.DLL goto ERR1

if errorlevel 1 set DRV=C:
if errorlevel 2 set DRV=D:

if %errorlevel% EQU 0 goto GO12
if %errorlevel% GTR 0 goto ERR6

find "not exist" c:\temp\report.log > nul
if %errorlevel% EQU 0 goto ERRNAME

if "%OS%" == "Windows_NT" goto NT_BIN

if exist _runscr.log del _runscr.log > nul

Example 4:
----------

if "%OS%" == "Windows_NT" goto NT_OS
CALL other.bat
EXIT
:NT_OS
CALL ntlogon.bat
EXIT

Example 5:
----------

IF NOT EXIST TypeFinder\BUILDALL.BAT GOTO TYPEFINDEREND
CD TypeFinder
CALL BUILDALL.BAT %1
CD ..
:TYPEFINDEREND

IF NOT EXIST Wintalk\BUILDALL.BAT GOTO WINTALKEND
CD Wintalk
CALL BUILDALL.BAT %1
CD ..
:WINTALKEND

IF NOT EXIST WordCount\BUILDALL.BAT GOTO WORDCOUNTEND
CD WordCount
CALL BUILDALL.BAT %1
CD ..
:WORDCOUNTEND

Example 6:
----------

@echo off
csc /t:module CountDownSecondsLabel.cs /r:System.dll /r:System.Windows.Forms.dll /r:System.Drawing.dll

rem if C++ is specified, create C++ DLL, otherwise create C# DLL
if "C++"=="%1" goto CPP
if "c++"=="%1" goto CPP
if "%1"=="" goto CS
goto ERROR

:CS
csc /t:module CountDownErrorLabel.cs /r:System.dll /r:System.Windows.Forms.dll /r:System.Drawing.dll
goto Continue

:CPP
cl /clr /LD CountDownErrorLabel.cpp /link /OUT:CountDownErrorLabel.netmodule

:Continue
ilasm Counter.il /dll
ilasm CountDownComponents.il /dll
ilasm CountDown.il
goto END

:ERROR
echo Invalid command line argument '%1'
echo.

:END


5.
The use of "Choice":
=======================

Choice is an external executable for the "cmd" or "DOS box" prompt. You can find it, for example,
in the MS Resource Kits of Win9x, NT, 2000.
Use it as in the following example:

echo Please enter the drive letter ( c/d/e/f/g )
choice /c:cdefg
if errorlevel 1 set DRV=C:
if errorlevel 2 set DRV=D:
etc..
echo Is this correct (y/n) ?
choice /c:yn /n > nul
if errorlevel 2 goto

Note that "if errorlevel n" is true when the errorlevel is n or higher. That is why the tests above
run in ascending order: the last test that matches sets the final value of DRV.

6. Pipelining examples:
=======================

SET | FIND "windir" > nul
IF errorlevel 1 ECHO Windows not running

In order to see if Oracle services are running on this machine:

net start | FIND "Ora"

7. Creating sub-routines in CMD files without creating new files:
=================================================================

With NT/2000 CMD files it's possible to call sub-routines without creating a new CMD file.
This lets you, the programmer/scripter, keep your scripts in one file and maintain an overview
of the scripts in use.

How does it work then? Those who know the DOS BATCH files (.bat) will remember the LABELS and
GOTO commands. Within NT, Microsoft extended this functionality: you can CALL a label, and at
the end of your sub-routine the script jumps back to the point where you called the label.
Just look at the following example:

@echo off
ECHO Start of part 1
CALL :part2
ECHO End of part 1
goto end

:part2
ECHO Start of part 2
ECHO (Some things you want to do)
ECHO End of part 2
goto :EOF

:end
ECHO Finished script

:EOF is a predefined label that marks the end of the file. Inside a sub-routine started with
"CALL :label", the command "goto :EOF" ends the sub-routine and returns to the line after the CALL.

8.
Oracle backup scripts partial code:
======================================

Example 1: archivelog backups
-----------------------------

@echo off
for /f "tokens=2-4 delims=/ " %%a in ('date /t') do (
set mm=%%a
set dd=%%b
set yyyy=%%c)

REM month/day/year mm/dd/yyyy
echo %mm%
echo %dd%
echo %yyyy%
set /A lastday=%dd%-1
echo %lastday%
set copydate=%mm%/%lastday%/%yyyy%
echo %copydate%

g:
cd\archives
xcopy *.* /D:%copydate% f:\backup

Note: this fragment assumes that "date /t" prints the date as "Day mm/dd/yyyy". It does not handle
the first day of a month (lastday becomes 0), and "set /A" rejects the days 08 and 09 because it
reads a leading zero as octal.

Example 2: maintenance exportfiles
----------------------------------

move /Y d:\backups\pegacc\2dayago\*.Z d:\backups\pegacc\3dayago
move /Y d:\backups\pegacc\1dayago\*.Z d:\backups\pegacc\2dayago
move /Y d:\backups\pegacc\*.Z d:\backups\pegacc\1dayago

move /Y d:\backups\pegtst\2dayago\*.Z d:\backups\pegtst\3dayago
move /Y d:\backups\pegtst\1dayago\*.Z d:\backups\pegtst\2dayago
move /Y d:\backups\pegtst\*.Z d:\backups\pegtst\1dayago

9. Append date and time to filename:
====================================

Q. How can I append the date and time to a file?

A. You can use the batch file below, which renames a file to filename_YYYYMMDDHHMM.
The order of the date fields depends on the regional date format of the machine.

@Echo OFF
TITLE DateName
REM DateName.CMD
REM takes a filename as %1 and renames as %1_YYYYMMDDHHMM
REM
REM -------------------------------------------------------------
IF %1.==. GoTo USAGE
Set CURRDATE=%TEMP%\CURRDATE.TMP
Set CURRTIME=%TEMP%\CURRTIME.TMP

DATE /T > %CURRDATE%
TIME /T > %CURRTIME%

Set PARSEARG="eol=; tokens=1,2,3,4* delims=/, "
For /F %PARSEARG% %%i in (%CURRDATE%) Do SET YYYYMMDD=%%l%%k%%j

Set PARSEARG="eol=; tokens=1,2,3* delims=:, "
For /F %PARSEARG% %%i in (%CURRTIME%) Do Set HHMM=%%i%%j%%k

Echo RENAME %1 %1_%YYYYMMDD%%HHMM%
RENAME %1 %1_%YYYYMMDD%%HHMM%
GoTo END

:USAGE
Echo Usage: DateName filename
Echo Renames filename to filename_YYYYMMDDHHMM
GoTo END

:END
REM
TITLE Command Prompt

Example:

D:\Exchange> datename logfile.log
RENAME logfile.log logfile.log_199809281630

10. Output of a program into an environment variable:
=====================================================

Q.
How can I force the output of a program into an environment variable? A. Some programs return values to the command line and it may be you want these in a variable so they can be viewed/queried by other processes. The easiest way to put the result into an environment variable is to trap it in a FOR statement. For /f "Tokens=*" %i in ('command') do set variable="%i" For example: C:\>For /f "Tokens=*" %i in ('ver') do set NTVersion="%i" C:\>set NTVersion="Windows NT Version 4.0 " C:\>echo %NTVersion% "Windows NT Version 4.0 " If you place the command in a batch file you require two % in front of i, e.g. For /f "Tokens=*" %%i in ('ver') do set NTVersion="%%i" 11. Get m columns from n in a text file: ======================================== Use the unix port freeware program cut.exe. Suppose you have a file x.txt similar to a b c d e f g h i j k l etc.. Now you only want certain columns in a new file. type x.txt | cut 1 3 > y.txt y.txt: a b e f i j 12. SCHEDULING: =============== Example 1: ---------- How to use the "at" command, please see the help given by: C:\> at /? >>> Example of the use of the at command: at 23:00 /every:M,T,W,Th,F backup.cmd That commands schedules the backup.cmd script on your local Server, to be executed at 23:00h at Monday, Tuesday, Wednesday, Thursday and Friday. >>> Other example @echo off rem rem NAME rem setat.cmd - NT command script rem at %1 /every:M,T,W,Th,F,S,Su %COMSPEC% /c "r:\ifa\bin\ifa.cmd" See also Part 3, section 5 13. Delete all files without prompting: ======================================= >> Best solution on NT, 2Kx, XP: --------------------------------- Delete of files, silently, in subdirs, also readonly ones (This is like a "rm -rf" on UNIX) cd %1 del /F /Q /S *.* >> Alternatives on all WinOS: ----------------------------- One of the most Frequently Asked Questions (FAQs) about batches is how to suppress the "Are you sure (Y/N)?" confirmation requirement for del *.*. 
Use the following:

echo y| del *.*

If you wish to suppress the message too, use

echo y| del *.* > nul

There is also another alternative for doing this. It has the advantage of being independent
of the MS-DOS language version.

for %%f in (*.*) do del %%f

If the directory is empty you can avoid the "File not found" message by applying

if exist *.* echo y| del *.* > nul

A better, obvious alternative by Rik D'haveloose:

if exist *.* for %%f in (*.*) do del %%f

14. Is there an easy way to append a new directory to the path?
===============================================================

This often needed trick is basically very simple. For example, to add directory %1 to the path use

path=%path%;%1

Note that on MS-DOS you can only use this trick in a batch. It will not work at the MS-DOS prompt,
because MS-DOS expands environment variables (%path%) only within batches. The cmd prompt of
NT/2000/XP expands them at the prompt as well.

It is also typical to need a fuller path only for the duration of some particular program,
and to restore the original after that:

@echo off
set path_=%path%
path=%path_%;f:\ftools
::
call whatever
::
path=%path_%
set path_=

15. Start an installation, tool etc..
===================================== Example 1: ---------- @echo off REM Oracle Migration Workbench startup script for Windows NT set PATH=E:\Program Files\Oracle\jre\1.1.7\bin\;E:\oracle\ora81\bin;E:\oracle\ora81\Omwb\olite;%PATH% SET JRE=jrew -nojit -mx128m SET NT_START=start REM Starting Oracle Migration Workbench on Windows NT %NT_START% %JRE% -classpath "E:\oracle\ora81\Omwb\olite\Oljdk11.jar;E:\oracle\ora81\Omwb\olite\Olite40.jar;E:\Program Files\Oracle\jre\1.1.7\lib\rt.jar;E:\Program Files\Oracle\jre\1.1.7\lib\i18n.jar;E:\oracle\ora81\Omwb\jlib;E:\oracle\ora81\Omwb\plugins\SQLServer6.jar;E:\oracle\ora81\Omwb\plugins\Sybase.jar;E:\oracle\ora81\Omwb\plugins\MSAccess.jar;E:\oracle\ora81\Omwb\plugins\SQLAnywhere.jar;E:\oracle\ora81\Omwb\plugins\SQLServer7.jar;E:\oracle\ora81\Omwb\jlib\omwb-1_3_0_0_0.jar;E:\oracle\ora81\jdbc\lib\classes111.zip;E:\oracle\ora81\lib\vbjorb.jar;E:\oracle\ora81\jlib\ewt-swingaccess-1_1_1.jar;E:\oracle\ora81\jlib\ewt-3_3_6.jar;E:\oracle\ora81\jlib\ewtcompat-opt-3_3_6.zip;E:\oracle\ora81\jlib\share-1_0_8.jar;E:\oracle\ora81\jlib\help-3_1_8.jar;E:\oracle\ora81\jlib\ice-4_06_6.jar;E:\oracle\ora81\jlib\kodiak-1_1_3.jar" -DORACLE_HOME=E:\oracle\ora81 oracle.mtg.migrationUI.MigrationApp oracle.mtg.migrationUI.MigrationApp Example 2: ---------- set OSQLPATH="c:\Program Files\Microsoft SQL Server\80\Tools\Binn set DBNAME=%2 %OSQLPATH%\osql.exe" -n -S%1 -d %DBNAME% -E -i%3.sql >> _runscr.log 16. Get rid of Carriage return ^M in files: =========================================== How do I eliminate carriage returns (^M) in my files? In unix its simple: ------------------- If you transfer text files from a DOS machine to a UNIX machine, you might see a ^M before the end of each line. This character corresponds to a carriage return. In DOS a newline is represented by the character sequence \r\n, where \r is the carriage return and \n is newline. In UNIX a newline is represented by \n. 
When text files created on a DOS system are viewed on UNIX, the \r is displayed as ^M.
You can strip these carriage returns out by using the tr command as follows:

tr -d '\r' < file > newfile

or on some unixes:

tr -d '\015' < file > newfile

Here file is the name of the file that contains the carriage returns, and newfile is the name
you want to give the file after the carriage returns have been deleted.
The second form uses the octal representation \015 for carriage return, because the escape
sequence \r is not interpreted correctly by all versions of tr.

Or you can use sed in the following way:

convert from unix to dos:

$ sed -e 's/$/\r/' myunix.txt > mydos.txt

convert from dos to unix (this deletes the last character of every line, so it assumes that
every line ends in a carriage return):

$ sed -e 's/.$//' mydos.txt > myunix.txt

So, install a unix shell on your PC, like Cygwin.

But now in dos/nt/2000/xp:
--------------------------

(1) get for example Cygwin or another 'unix' emulator engine for nt/2000/xp
where you can run tr and sed like commands.

(2) with nt/2000/xp tools only:

17: start a program in a minimised window:
==========================================

Example:

start /min notepad c:\core\cmdshell.txt

18: COMM ports in DOS (dos, NT, 2000, 2003, XP):
================================================

Examples to test a port:

1. echo AT&F>com1

2. C:\>mode com3

Status for device COM3:
-----------------------
    Baud:            115200
    Parity:          None
    Data Bits:       8
    Stop Bits:       1
    Timeout:         OFF
    XON/XOFF:        OFF
    CTS handshaking: OFF
    DSR handshaking: OFF
    DSR sensitivity: OFF
    DTR circuit:     ON
    RTS circuit:     OFF

Examples to assign a port:

If you want to map COM12 to COM1, so that it can be used by an MS-DOS application, type:

change port com12=com1

The following command shows the current port mappings:

change port /query

19.
Special File and Volume commands in XP:
===========================================

fsutil:
-------

Fsutil is a command-line utility that you can use to perform many FAT and NTFS file system
related tasks, such as managing reparse points, managing sparse files, dismounting a volume,
or extending a volume. Because fsutil is quite powerful, it should only be used by advanced
users who have a thorough knowledge of Windows XP. In addition, you must be logged on as an
administrator or a member of the Administrators group in order to use fsutil.

Fsutil: dirty
-------------

Use this to see whether a volume's dirty bit is set, or to set a volume's dirty bit.
When a volume's dirty bit is set, autochk automatically checks the volume for errors
the next time the computer is restarted.

Syntax

fsutil dirty {query|set} PathName

Parameters

query       Queries the dirty bit.
set         Sets a volume's dirty bit.
PathName    Specifies the drive letter (followed by a colon), mount point, or volume name.

Examples

- To query the dirty bit on drive C, type:

fsutil dirty query C:

Sample output:

Volume C: is dirty
or
Volume C: is not dirty

- To set the dirty bit on drive C, type:

fsutil dirty set C:

Fsutil: volume
--------------

Use this to manage a volume: dismount a volume, or query how much free space is available on a disk.

Syntax

fsutil volume [diskfree] drivename
fsutil volume [dismount] VolumePathname

Parameters

diskfree         Queries the free space of a volume.
drivename        Specifies the drive letter (followed by a colon).
dismount         Dismounts a volume.
VolumePathname   Specifies the drive letter (followed by a colon), mount point, or volume name.

Examples

- To dismount a volume on drive C, type:

fsutil volume dismount C:

- To query the free space of a volume on drive C, type:

fsutil volume diskfree C:

20.
Start a program like a DB sql prompt util and run a script: =============================================================== example.cmd ----------- cls echo off c:\oracle\ora92\bin\sqlplus /nolog @c:\logging\example.sql > z:\its\oc\databases\oracle_logging\example.log So the .cmd file calls a program sqlplus which will run a .sql script, while the output will be placed in a designated logfile. example.sql ----------- The .sql file might contain something like the following: connect system/arcturus81@ECM_172.17.203.162 REM Logon to DB alter system checkpoint REM true DB commands / SELECT * FROM v$sgastat WHERE name = 'free memory' / alter system flush shared_pool / SELECT * FROM v$sgastat WHERE name = 'free memory' / 21. Remote terminal Services: ============================= If your XP, or Server has the terminal services client, or RDP, installed, you can run it via C:\>mstsc A dialog box will show, where you can enter the name or IP of the target system. C:\>mstsc /? Will show all switches you can use. 22. Run a script, or program with elevated credentials: ======================================================= In XP, Vista, Win2Kx you can, as an ordinary user, run a script, or program, with elevated credentials, that is, using another account, using the "runas" utility. - From the prompt, use runas: Syntax RUNAS [/profile] [/env] [/netonly] /user:user Program Key /profile Option to load the user's profile (registry) /env Use current environment instead of user's. /netonly Use if the credentials specified are for RAS only. /user Username in form USER@DOMAIN or DOMAIN\USER (USER@DOMAIN is not compatible with /netonly) Program The command to execute Examples: runas /profile /user:mymachine\administrator CMD runas /profile /env /user:SCOT_DOMAIN\administrator NOTEPAD runas /env /user:jDoe@swest.ss64.com "NOTEPAD \"my file.txt\"" Enter the password when prompted. - From the Windows explorer GUI Select an executable file, Right-click and select Run As.. 
This option can be hidden by setting
HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer

Examples:

C:\> runas /user:Administrator@afa.com "mycommand.exe"

Here you run "mycommand.exe" as the Administrator of the Domain "afa".

So runas works quite like the unix sudo tool. But the tool will ask for the password of the user
listed in the command.
If you need encryption of command files, and many other options, check out the great "runasspc" tool.
Just google on runasspc to find more on this useful tool.

23. Show running services:
==========================

C:\> net start

Shows all running services on your machine.

C:\> net start | find "Part of Service name"

Shows all running services with a name like "Part of Service name".

To show all services related to Oracle:

C:\> net start | find "Ora"

Or use this to show services:

C:\> cmd /C SC Query > C:\temp\services.txt

Lists all your running services to the file services.txt.

24. Some systemtools for Windows:
=================================

We are not going to differentiate between all possible Windows versions here (like XP, Vista, Win2K3 etc..),
but there might be a few additional tools that can be of interest.
Of course, everybody knows regedit or regedt32, for viewing or editing the Registry.
And, likewise, everybody knows that the Resource Kits deliver many additional tools for your platform.

Besides all that, most Windows versions also have:

-- sysedit.exe:
It shows you win.ini, system.ini, config.sys and autoexec.bat.
The config files win.ini and system.ini might still be important for older win applications.

-- systeminfo.exe:
It shows you a lot of hardware and system related information. It might also present a nice list
of all the patches and hotfixes that were applied on your system.

#######################################################################
#######################################################################
Part 2: Profiles and Loginscripts for clients on Win2Kx Servers.
#######################################################################
#######################################################################

1. Logon scripts in 2000/ 2003:
===============================

Note 1:
-------

When a client logs on from a Domain member machine, such as a win 2000 professional workstation,
an XP workstation, or a Vista machine, a logon script can be executed for this user.
It can be a KiXtart script, a shell .cmd file, a .vbs file, or something else.

The use of .cmd files is generally considered "old-fashioned", and many people use GPO and even .vbs files.
But there is no serious reason not to use a plain old .cmd file.

Just put the logon script on the nearest Domain Controller in the following location
(where DomainName is the name of your domain):

%systemroot%\sysvol\sysvol\DomainName\SCRIPTS

This should be replicated to the other Domain Controllers in the tree.

>>>> Typical commands in a cmd batch loginscript could be: <<<<

-- General drive mappings for all users, like for example:

net use u: \\starboss\public
net use v: \\starboss\software

-- Per user settings, like for example:

if "%username%"=="John"
if "%username%"=="Mark"
etc..

Example:

IF /I "%USERNAME%" == "TESTUSR" goto Test

:test
Here are your commands under the label test

Note 2:
-------

Extended note for loginscripts:

Creating logon scripts

You can use logon scripts to assign tasks that will be performed when a user logs on to a particular computer.
The scripts can carry out operating system commands, set system environment variables, and call other scripts
or executable programs. The Windows Server 2003 family supports two scripting environments: the command processor
runs files containing batch language commands, and Windows Script Host (WSH) runs files containing
Microsoft Visual Basic Scripting Edition (VBScript) or Jscript commands. You can use a text editor to create
logon scripts. Some tasks commonly performed by logon scripts include:

-Mapping network drives.
-Installing and setting a user's default printer.
-Collecting computer system information. -Updating virus signatures. -Updating software. The following example logon script contains VBScript commands that use Active Directory Service Interfaces (ADSI) to perform three common tasks based on a user's group membership: It maps the H: drive to the home directory of the user by calling the WSH Network object's MapNetworkDrive method in combination with the WSH Network object's UserName property. It uses the ADSI IADsADSystemInfo object to obtain the current user's distinguished name, which in turn is used to connect to the corresponding user object in Active Directory. Once the connection is established, the list of groups the user is a member of is retrieved by using the user's memberOf attribute. The multivalued list of group names is joined into a single string by using VBScript's Join function to make it easier to search for target group names. If the current user is a member of one of the three groups defined at the top of the script, then the script maps the user's G: drive to the group shared drive, and sets the user's default printer to be the group printer. To create an example logon script Open Notepad or other ascii text editor. 
Copy and paste, or type, the following:

' Group names to look for in the user's memberOf list
Const ENGINEERING_GROUP = "cn=engineering"
Const FINANCE_GROUP = "cn=finance"
Const HUMAN_RESOURCES_GROUP = "cn=human resources"

' Map H: to the user's home directory
Set wshNetwork = CreateObject("WScript.Network")
wshNetwork.MapNetworkDrive "h:", "\\FileServer\Users\" & wshNetwork.UserName

' Connect to the user object in Active Directory
Set ADSysInfo = CreateObject("ADSystemInfo")
Set CurrentUser = GetObject("LDAP://" & ADSysInfo.UserName)

' Join all group names into one lowercase string to search in
strGroups = LCase(Join(CurrentUser.MemberOf))

If InStr(strGroups, ENGINEERING_GROUP) Then

    wshNetwork.MapNetworkDrive "g:", "\\FileServer\Engineering\"
    wshNetwork.AddWindowsPrinterConnection "\\PrintServer\EngLaser"
    wshNetwork.AddWindowsPrinterConnection "\\PrintServer\Plotter"
    wshNetwork.SetDefaultPrinter "\\PrintServer\EngLaser"

ElseIf InStr(strGroups, FINANCE_GROUP) Then

    wshNetwork.MapNetworkDrive "g:", "\\FileServer\Finance\"
    wshNetwork.AddWindowsPrinterConnection "\\PrintServer\FinLaser"
    wshNetwork.SetDefaultPrinter "\\PrintServer\FinLaser"

ElseIf InStr(strGroups, HUMAN_RESOURCES_GROUP) Then

    wshNetwork.MapNetworkDrive "g:", "\\FileServer\Human Resources\"
    wshNetwork.AddWindowsPrinterConnection "\\PrintServer\HrLaser"
    wshNetwork.SetDefaultPrinter "\\PrintServer\HrLaser"

End If

On the File menu, click Save As.
In Save in, click the directory that corresponds to the domain controller's Netlogon shared folder
(usually SystemRoot\SYSVOL\Sysvol\DomainName\Scripts where DomainName is the domain's fully qualified domain name).
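
The group test in the logon script above is easy to lose among the drive and printer calls.
As a minimal sketch, that matching step alone might look as follows in JScript-style JavaScript
(Jscript being the other WSH language named above). The helper names joinGroups and
pickGroupShare are hypothetical. The WSH and ADSI calls are left out, so under WSH the memberOf
value would first have to be read and converted to a JScript array (not shown). Treating memberOf
as possibly missing or as a single string is an assumption about what the attribute can return;
the VBScript above does not handle those cases.

```javascript
// Group markers and share paths copied from the VBScript example above.
var GROUP_SHARES = [
  { marker: "cn=engineering",     share: "\\\\FileServer\\Engineering\\" },
  { marker: "cn=finance",         share: "\\\\FileServer\\Finance\\" },
  { marker: "cn=human resources", share: "\\\\FileServer\\Human Resources\\" }
];

// Hypothetical helper: turn a memberOf value into one lowercase string.
// Assumption: the value may be missing, a single string, or an array.
function joinGroups(memberOf) {
  if (memberOf === undefined || memberOf === null) {
    return "";
  }
  if (Object.prototype.toString.call(memberOf) === "[object Array]") {
    // Same idea as VBScript's Join, which separates items with a space.
    return memberOf.join(" ").toLowerCase();
  }
  return String(memberOf).toLowerCase();
}

// Hypothetical helper: return the share of the first matching group,
// or null when the user is in none of the three groups.
function pickGroupShare(memberOf) {
  var groups = joinGroups(memberOf);
  for (var i = 0; i < GROUP_SHARES.length; i++) {
    if (groups.indexOf(GROUP_SHARES[i].marker) !== -1) {
      return GROUP_SHARES[i].share;
    }
  }
  return null;
}
```

For example, pickGroupShare("CN=Finance,OU=Groups,DC=example,DC=com") returns the Finance share,
and pickGroupShare(undefined) returns null, in which case no G: drive would be mapped.
The example.com names are made up for the illustration.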
Note 3: ------- If you want to assign login scripts through a Group Policy Object, go to Active Directory Users and Computers tool, and use GPO, and navigate to "User Config>Windows Settings >Scripts (Logon/Logoff)" Note 4: ------- Simple login script, login.cmd, example for Vista or XP clients on Win2K3 Server: net use u: \\sonne\public net use v: \\sonne\data net use w: \\sonne\software net use t: \\sonne\buro net use s: \\sonne\Backups_Netz_PCs regedit /s pol11.reg regedit /s pol12.reg copy \\sonne\netlogon\crpol.bat c:\temp /Y call c:\temp\crpol.bat regedit /s c:\temp\crpol.reg copy \\sonne\netlogon\message.vbs c:\temp /Y REM copy \\sonne\netlogon\Paths.xcu c:\users\%username%\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Absolutus copy \\sonne\netlogon\Absolutus.xcu c:\users\Absolutus\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Alkoholix copy \\sonne\netlogon\Alkoholix.xcu c:\users\Alkoholix\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Ammoniake copy \\sonne\netlogon\Ammoniake.xcu c:\users\Ammoniake\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Appelmus copy \\sonne\netlogon\Appelmus.xcu c:\users\Appelmus\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Avantipopulus copy \\sonne\netlogon\Avantipopulus.xcu c:\users\Avantipopulus\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Bossix copy \\sonne\netlogon\Bossix.xcu c:\users\Bossix\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Cleopatra copy \\sonne\netlogon\Cleopatra.xcu c:\users\Cleopatra\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Crazfus copy \\sonne\netlogon\Crazfus.xcu 
c:\users\Crazfus\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Gutzufus copy \\sonne\netlogon\Gutzufus.xcu c:\users\Gutzufus\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Kontrabas copy \\sonne\netlogon\Kontrabas.xcu c:\users\Kontrabas\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Ofenaus copy \\sonne\netlogon\Ofenaus.xcu c:\users\Ofenaus\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Stenograf copy \\sonne\netlogon\Stenograf.xcu c:\users\Stenograf\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Wachtelchen copy \\sonne\netlogon\Wachtelchen.xcu c:\users\Wachtelchen\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul if %username%==Prognostix copy \\sonne\netlogon\Prognostix.xcu c:\users\Prognostix\AppData\Roaming\OpenOffice.org2\user\registry\data\org\openoffice\office /Y > nul c:\temp\message.vbs 2. The ifmember utility: ======================== This is also batch .cmd related, which is a quite old technique, but it's still possible to use it in Win2Kx login scripts. Although advisable is the use of GPO, that is the Group Policy Editor. IfMember is often used in Windows logon scripts and other batch files. In the following example, the batch file containing IfMember maps a network drive based on group membership. 
If the user logs on to a computer using this batch file and is a member of the HR group,
the batch file maps the following share for the user:

\\server1\hr_share$

If the user is a member of the Marketing group, the batch file maps the following share for the user:

\\server1\marketing_share$

If the user is a member of the Administrators group, the batch file maps the following share for the user:

\\server1\admin_share$

The share being mapped for the user appears as a network connection in Windows Explorer on the user's computer.
The mapped share is assigned the next available drive letter.

Batch File

echo off
ifmember hr
if errorlevel 1 goto hr
ifmember marketing
if errorlevel 1 goto marketing
ifmember administrators
if errorlevel 1 goto administrators
goto end

:hr
net use * \\server1\hr_share$
goto end

:marketing
net use * \\server1\marketing_share$
goto end

:administrators
net use * \\server1\admin_share$
goto end

:end
Exit

In the following example, the batch file containing IfMember queries multiple groups simultaneously.
If the user logs on to a computer using this batch file and is a member of both the HR and the
Marketing group, the batch file maps \\server1\marketinghr_share$. If the user is a member of only
one of these two groups, it maps \\server1\standard_share$.

echo off
ifmember hr marketing
if errorlevel 2 goto hrANDmarketing
if errorlevel 1 goto hrORmarketing
ifmember administrators
if errorlevel 1 goto administrators
goto end

:hrANDmarketing
net use * \\server1\marketinghr_share$
goto end

:hrORmarketing
net use * \\server1\standard_share$
goto end

:administrators
net use * \\server1\admin_share$
goto end

:end
Exit

3. The use of Runas in a login script:
======================================

See also Part 1, section 22.

Of course you can use "runas" in a loginscript (or other batch file), but the standard "runas" utility asks for
the password of the user you want "to run as". You can pass the password on the commandline, but in a script
this will be cleartext, which might present a security problem.
One of the better options here is the "runasspc" tool:

http://www.robotronic.de/runasspc/

It is really good, and it lets you store the password in an encrypted file.

4. Some remarks on Vista profiles on Windows Server 2003/2008:
==============================================================

Note 1: If you have a corrupt profile, or it does not load correctly from the Server:
=====================================================================================

If you have a corrupt Vista or XP profile on a client station, or the userprofile somehow does not load
anymore from the Server, you might consider the following:

Suppose the profile of Domain User "Alkoholix" does not load correctly to this station, while on another
station it works OK.

-- On that particular client Workstation, login as the local administrator, or Domain Admin.

-- Optional: if the former local userprofile might contain data, you should save that first.
   That might be done like this:

   >> Now run 'takeown /r /a /d y /f %systemdrive%\users\Alkoholix'
   >> Rename the folder Alkoholix to Alkoholix.save

-- Run regedit and take a look in:

   HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList

   Remove the entry with the SID of the suspect corrupt profile.

-- Login as the Domain User "Alkoholix". Hopefully the userprofile loads correctly now.

You might also check the following in the registry of that client:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList

There is 1 entry for each profile. If you suspect a profile is bad, you can check the following:

-- Ensure the key name doesn't end in ".bad"
-- Ensure the RefCount value is 0
-- Ensure the State value is 0

Note 2: XP and Vista clients in the same Domain with roaming profiles:
======================================================================

This can lead to unpleasant surprises.
Server-stored XP and Vista profiles differ too much to take for granted that a user who logs on to
a Domain from an XP client, and gets a userprofile loaded, will get that same userprofile on
a Vista client. In a standard setup, that is not the case.

If you look at the Server in the directory where the profiles are stored, you might observe the following
for, for example, the user "Alkoholix":

\\starboss\PROFILES\Alkoholix.V2    (Vista profile)
\\starboss\PROFILES\Alkoholix       (XP profile)

There are many differences between XP and Vista profiles.

So what now? This subject is large enough to redirect you to other places for better information.
Some suggestions are:

http://4sysops.com/archives/windows-vista-and-windows-xp-roaming-user-profiles-interoperability-folder-redirection-is-the-only-way/
http://technet.microsoft.com/en-us/library/cc766489.aspx

###############################################################################################################
###############################################################################################################
Part 3: Some commands in Vista and Windows 2008 Server (some commands were available in older versions as well)
###############################################################################################################
###############################################################################################################

1.
Showing and altering settings on network interfaces: The "netsh" command: ============================================================================ Runs on 2000, XP, 2003, 2008 Let's first show some examples on usage: -- Show settings on network interfaces C:\>netsh interface ipv4 show interfaces Idx Met MTU Status Naam --- --- ----- ----------- ------------------- 1 50 4294967295 connected Loopback Pseudo-Interface 1 10 40 1500 connected Draadloze netwerkverbinding 9 30 1500 disconnected LAN-verbinding 12 50 1500 disconnected Bluetooth-netwerkverbinding The "Idx" number identifies your interface if you want to change settings. -- Example of Altering settings C:\> netsh interface ipv4 set address name=10 source=static address=192.168.100.75 mask=255.255.255.0 gateway=192.168.100.1 C:\>netsh interface ipv4 add dnsserver name=2 address=192.168.100.40 -- More on netsh: Netsh.exe is a command-line scripting utility that allows you to, either locally or remotely, display or modify the network configuration of a computer that is currently running. Netsh.exe also provides a scripting feature that allows you to run a group of commands in batch mode against a specified computer. Netsh.exe can also save a configuration script in a text file for archival purposes or to help you configure other servers. Netsh.exe is available on Windows 2000, Windows XP, Windows Server 2003, Vista, Windows Server 2008 . 2. Activate your Windows 2008 Server: ===================================== C:\> slmgr.vbs -ato This is indeed a windows scripting host file, executed by Windows Scripting Host. Activating the remote Windows 2008 Server called "starboss": C:\> slmgr.vbs starboss Administrator -ato -- More on slmgr: SLMGR stands for Software License ManaGeR, or its full name, Windows Software Licensing Management Tool. SLMgr is the main component in Windows Vista (and 2003, 2008) that manages system activation and product key, the license to use Windows. 
All functions of SLMgr are provided by slmgr.vbs, a command-line utility based on VBScript.
Most activation related commands in the graphical user interface, such as System Properties,
call the slmgr.vbs script to perform the licensing operation.
Even if you run SLMgr commands from the command line, the results and any error details are displayed
in a pop-up dialog window.

Here is a usage guide for slmgr in Vista, a useful reference when you are facing activation problems,
or when you have been forced into Reduced Functionality Mode.

Where and How to Use SLMgr.vbs

There are several ways to run SLMgr.vbs commands:

-- Command prompt window. This is the way to run SLMgr with options that require elevated
   administrator privileges.
-- Run command.
-- Start Search box integrated in the Start Menu. With this method you type the full script name,
   SLMgr.vbs, into the search box, so that the command looks like "slmgr.vbs -ato".

The best known use of SLMgr is "slmgr.vbs -rearm", which extends the trial period of Vista
for another 30 days. Besides this popular switch, SLMgr.vbs supports a list of other options,
which you can view with the "SLMgr.vbs -?" command. A result window as shown below is displayed.

SLMgr Usage Syntax

slmgr.vbs [MachineName [User Password]] [