.. _getting_started:

***************
Getting started
***************

.. _installing-rocks:

Installing Rocks on the frontend node (q.rz-berlin.mpg.de, 141.14.128.18).

Access to iDrac service interface and boot up the kernel iso
============================================================
Access the iDRAC via HTTP at 141.14.128.17 (q-sp.rz-berlin.mpg.de).
The address of q-sp was set manually; the first assignment came via DHCP.

The initial password can be found on the pull-out label.
The root password was changed to the PP&B default remote access password.

Launch the console (on a Mac, remember to allow Java to run).
Map kernel.iso (from http://www.rocksclusters.org/downloads.html) as a DVD.
Boot up (warm boot, booting from the DVD).

Configuring the system
======================
For instructions, see
http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/install-frontend-7.html

Name: q.fhi-berlin.mpg.de

Public net (FHI) on em2 (10 Gbit/s) with IP 141.14.128.18/20.
Gateway: 141.14.128.128, nameserver: 141.14.128.1, search domains:
fhi-berlin.mpg.de, rz-berlin.mpg.de

Private net on p7p1 (the IP 10.1.1.1 is chosen by the system).

The disk setup should include a RAID system with 10 TByte for /home. Two RAIDs
should be present, one Raid1 and one Raid15 (the big one).

Rocks rolls: all except fingerprint and htcondor (?).

RAID 1 with 2 x 150 GByte SSDs as boot disk.

RAID6x -> 11 TiB: 10 TiB for /home, 1 TiB for /export (used for /share/apps).
This must be configured manually.

The installation takes 2-3 hours.

Update rocks
============
The following procedure led to trouble with unresolved dependencies::

    baseurl=http://ftp.fau.de
    osversion=7.4.1708
    version=$(date +%F)
    rocks create mirror ${baseurl}/centos/${osversion}/updates/x86_64/Packages/ rollname=Updates-CentOS-${osversion} version=${version}
    rocks add roll Updates-CentOS-${osversion}-${version}*iso
    rocks enable roll Updates-CentOS-${osversion} version=${version}
    (cd /export/rocks/install; rocks create distro)
    yum clean all; yum update

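
The path handed to ``rocks create mirror`` is assembled from several variables, so it can help to print it before starting a long download. The following is a minimal sketch; ``mirror_url`` and ``roll_name`` are hypothetical helper names and not part of Rocks:

```shell
#!/bin/bash
# Hypothetical helpers (not part of Rocks): build the strings passed to
# "rocks create mirror" so they can be inspected before a long download.

# mirror_url BASEURL OSVERSION -> URL of the CentOS updates package directory.
mirror_url() {
    local baseurl="${1%/}"   # drop one trailing slash to avoid "//" in the path
    local osversion="$2"
    printf '%s/centos/%s/updates/x86_64/Packages/\n' "$baseurl" "$osversion"
}

# roll_name OSVERSION -> roll name in the form used in these notes.
roll_name() {
    printf 'Updates-CentOS-%s\n' "$1"
}
```

For example, ``mirror_url http://ftp.fau.de 7.4.1708`` prints ``http://ftp.fau.de/centos/7.4.1708/updates/x86_64/Packages/``, which can be opened in a browser to confirm that the directory exists on the mirror.
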
Prepare the iDracs for the compute nodes
========================================
.. note::

   All the scripts can be found at http://hg.rz-berlin.mpg.de/qSetup
   (requires ``yum install mercurial``).

To give IP addresses to the iDRAC interfaces of the compute nodes, DHCP must be
set up for the management net.

To add a management net, the management switches (1 Gbit/s, q-msw-01, q-msw-02) must be
configured (link?) and connected to the iDRACs of all nodes, including the frontend.

On the frontend, a network must be created for this mgmt net and an interface
must be dedicated to it::

    rocks add network mgmt subnet=10.0.12.0 netmask=255.255.255.0
    rocks set host interface ip q iface=em3 ip=10.0.12.1
    rocks set host interface subnet q iface=em3 subnet=mgmt
    rocks set host interface name q iface=em3 name=q-mgmt
    rocks sync config
    rocks sync host network q
    rocks list network

Now DHCP for these hosts must be included. Because the Rocks distro creates the
DHCP entries via kickstart, the Rocks Python system file must be altered::

    vi /opt/rocks/lib/python2.7/site-packages/rocks/commands/report/host/dhcpd/__init__.py

    # add "em3" to DHCPDARGS:
    self.addOutput('', 'DHCPDARGS="%s em3"' % device)

    # add the following line just before self.addOutput('', '</file>'):
    self.addOutput('', 'include "/root/FHI/mgmt.dhcp";')

Now one has to create this file with the MAC addresses of the iDRACs. The
addresses can be found on the pull-out label on the front of each node::

    [root@q FHI]# cat mgmt.dhcp
    subnet 10.0.12.0 netmask 255.255.255.0 {
        default-lease-time 1200;
        max-lease-time 1200;
        option routers 10.0.12.1;
        option subnet-mask 255.255.255.0;
        option domain-name "mgmt";
        option domain-name-servers 10.0.12.1;
        option broadcast-address 10.0.12.255;
        option interface-mtu 1500;
        group "mgmt" {
            host mgmt-q { # Frontend
                hardware ethernet 24:6e:96:79:7c:46;
                fixed-address 10.0.12.1;
            }
            host sp-compute-0-0 { # iDRAC-BDWKGM2
                hardware ethernet d0:94:66:27:7b:5e;
                fixed-address 10.0.12.10;
            }
            host sp-compute-0-1 { # iDRAC-BDWHGM2
                hardware ethernet d0:94:66:28:47:cd;
                fixed-address 10.0.12.11;
            }
            host sp-compute-0-2 { # iDRAC-BDW9GM2
                hardware ethernet d0:94:66:2c:0d:e2;
                fixed-address 10.0.12.12;
            }
            host sp-compute-0-3 { # iDRAC-BDWGGM2
                hardware ethernet d0:94:66:20:2a:34;
                fixed-address 10.0.12.13;
            }
            host sp-compute-0-4 { # iDRAC-BDRGGM2
                hardware ethernet d0:94:66:1f:3f:cc;
                fixed-address 10.0.12.14;
            }
            host sp-compute-0-5 { # iDRAC-BDTKGM2
                hardware ethernet d0:94:66:28:61:99;
                fixed-address 10.0.12.15;
            }
            host sp-compute-0-6 { # iDRAC-BDXBGM2
                hardware ethernet d0:94:66:27:62:39;
                fixed-address 10.0.12.16;
            }
            host sp-compute-0-7 { # iDRAC-BDVCGM2
                hardware ethernet d0:94:66:2c:0c:4a;
                fixed-address 10.0.12.17;
            }
            host sp-compute-0-8 { # iDRAC-BDT9GM2
                hardware ethernet d0:94:66:2b:ff:4f;
                fixed-address 10.0.12.18;
            }
            host sp-compute-0-9 { # iDRAC-BDVDGM2
                hardware ethernet d0:94:66:27:52:a2;
                fixed-address 10.0.12.19;
            }
            host sp-compute-0-10 { # iDRAC-BDVJGM2
                hardware ethernet d0:94:66:27:48:c2;
                fixed-address 10.0.12.20;
            }
            host sp-compute-0-11 { # iDRAC-BDRDGM2
                hardware ethernet d0:94:66:1f:42:46;
                fixed-address 10.0.12.21;
            }
            host sp-compute-0-12 { # iDRAC-BDWFGM2
                hardware ethernet d0:94:66:27:5b:d6;
                fixed-address 10.0.12.22;
            }
            host sp-compute-0-13 { # iDRAC-BDSBGM2
                hardware ethernet d0:94:66:20:29:54;
                fixed-address 10.0.12.23;
            }
            host sp-compute-0-14 { # iDRAC-BDSDGM2
                hardware ethernet d0:94:66:20:28:3a;
                fixed-address 10.0.12.24;
            }
            host sp-compute-0-15 { # iDRAC-BDRJGM2
                hardware ethernet d0:94:66:1f:53:f7;
                fixed-address 10.0.12.25;
            }
            host sp-compute-0-16 { # iDRAC-BDWDGM2
                hardware ethernet d0:94:66:2c:0e:0a;
                fixed-address 10.0.12.26;
            }
            host sp-compute-0-17 { # iDRAC-BDWBGM2
                hardware ethernet d0:94:66:28:45:62;
                fixed-address 10.0.12.27;
            }
            host sp-compute-0-18 { # iDRAC-BDWJGM2
                hardware ethernet d0:94:66:20:2b:be;
                fixed-address 10.0.12.28;
            }
            host sp-compute-0-19 { # iDRAC-BDWGGM2 (same tag and MAC as sp-compute-0-3; verify)
                hardware ethernet d0:94:66:20:2a:34;
                fixed-address 10.0.12.29;
            }
        }
    }

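
Typing twenty nearly identical ``host`` stanzas by hand invites copy-and-paste mistakes. The stanzas could instead be generated from a list of service tags and MAC addresses. The sketch below assumes the naming scheme ``sp-compute-0-N`` and consecutive addresses starting at 10.0.12.10, as in the file above; the function name ``gen_mgmt_hosts`` and the input file are hypothetical:

```shell
#!/bin/bash
# Sketch: emit dhcpd "host" stanzas from "service-tag MAC" pairs on stdin.
# Assumptions: hosts are named sp-compute-0-N and addresses are handed out
# consecutively from 10.0.12.10, in input order.
gen_mgmt_hosts() {
    local n=0 tag mac
    while read -r tag mac; do
        if [ -z "$tag" ]; then
            continue                      # skip blank lines
        fi
        printf 'host sp-compute-0-%d { # iDRAC-%s\n' "$n" "$tag"
        printf '    hardware ethernet %s;\n' "$mac"
        printf '    fixed-address 10.0.12.%d;\n' "$((10 + n))"
        printf '}\n'
        n=$((n + 1))
    done
}
```

With a hypothetical input file ``idracs.txt`` containing lines such as ``BDWKGM2 d0:94:66:27:7b:5e``, running ``gen_mgmt_hosts < idracs.txt`` prints the stanzas, which can then be pasted into the ``group`` section of :file:`mgmt.dhcp`.
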
Restart the dhcpd service with::

    service dhcpd restart
    -> Redirecting to /bin/systemctl restart dhcpd.service

Check (the default password for the iDRAC is calvin)::

    [root@q log]# ssh root@10.0.12.10
    root@10.0.12.10's password:

The network cable is connected to B (left, near the PCI-X bus).

We want to PXE-boot from the 10 Gbit interface on the PCIx card X710::

    /admin1-> racadm get NIC.NICConfig
    NIC.NICConfig.1 [Key=NIC.Slot.2-1-1#NICConfig]
    NIC.NICConfig.2 [Key=NIC.Slot.2-2-1#NICConfig]
    NIC.NICConfig.3 [Key=NIC.Embedded.1-1-1#NICConfig]
    NIC.NICConfig.4 [Key=NIC.Embedded.2-1-1#NICConfig]

    /admin1-> racadm set NIC.NICConfig.2.LegacyBootProto PXE
    [Key=NIC.Slot.2-2-1#LegacyBootProto]
    RAC1017: Successfully modified the object value and the change is in
    pending state.
    To apply modified value, create a configuration job and reboot
    the system. To create the commit and reboot jobs, use "jobqueue"
    command. For more information about the "jobqueue" command,
    see RACADM help.

    /admin1-> racadm jobqueue create NIC.Slot.2-2-1
    RAC1024: Successfully scheduled a job.
    Verify the job status using "racadm jobqueue view -i JID_xxxxx" command.
    Commit JID = JID_168281383887

    /admin1-> racadm serveraction powercycle

    /admin1-> racadm set BIOS.BiosBootSettings.BootSeq NIC.Slot.2-2-1
    [Key=BIOS.Setup.1-1#BiosBootSettings]
    RAC1017: Successfully modified the object value and the change is in
    pending state.
    To apply modified value, create a configuration job and reboot
    the system. To create the commit and reboot jobs, use "jobqueue"
    command. For more information about the "jobqueue" command, see RACADM
    help.

    /admin1-> racadm get BIOS.BiosBootSettings.BootSeq
    [Key=BIOS.Setup.1-1#BiosBootSettings]
    BootSeq=NIC.Embedded.1-1-1,NIC.Slot.2-1-1
    (Pending Value=NIC.Slot.2-1-1,NIC.Embedded.1-1-1)

    /admin1-> racadm jobqueue create BIOS.Setup.1-1
    RAC1024: Successfully scheduled a job.
    Verify the job status using "racadm jobqueue view -i JID_xxxxx" command.
    Commit JID = JID_168368767313

    /admin1-> racadm jobqueue view -i JID_168368767313
    ---------------------------- JOB -------------------------
    [Job ID=JID_168368767313]
    Job Name=Configure: BIOS.Setup.1-1
    Status=Scheduled
    Start Time=[Now]
    Expiration Time=[Not Applicable]
    Message=[JCP001: Task successfully scheduled.]
    Percent Complete=[0]
    ----------------------------------------------------------

    /admin1-> racadm serveraction powercycle

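
Repeating the interactive session on every node is tedious, so the same three racadm calls could be generated for a whole range of iDRACs. The sketch below only prints the commands (a dry run). It assumes that every node uses the same ``NIC.NICConfig`` index and slot key as in the session above, and that the iDRAC accepts a racadm command passed on the ssh command line; ``pxe_commands`` is a hypothetical name:

```shell
#!/bin/bash
# Sketch: print (dry run) the per-node racadm calls from the session above
# for the iDRACs 10.0.12.FIRST .. 10.0.12.LAST.
# Assumptions: identical NIC layout on all nodes; the iDRAC accepts a
# racadm command given on the ssh command line.
pxe_commands() {
    local first="$1" last="$2" i ip
    for i in $(seq "$first" "$last"); do
        ip="10.0.12.$i"
        echo "ssh root@$ip racadm set NIC.NICConfig.2.LegacyBootProto PXE"
        echo "ssh root@$ip racadm jobqueue create NIC.Slot.2-2-1"
        echo "ssh root@$ip racadm serveraction powercycle"
    done
}
```

``pxe_commands 10 29 > setPXE.sh`` writes the commands for all twenty compute nodes into a file (the file name is only an example) that can be reviewed before it is run.
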
To get rid of opensm log entries::

    /bin/systemctl disable opensm

insert-ethers could not be started because httpd was not running. The directory
/run/httpd had to be created with owner apache:apache; after that, httpd could be
started with ``service httpd start``.

insert-ethers can now be started, but this was only a test of how to deal with the iDRAC.

Put root's SSH key on all iDRACs.

Create FHI/idracSSHKey (the key is taken from /root/.ssh/id_rsa.pub)::

    racadm sshpkauth -i 2 -k 1 -t "ssh-rsa AAAA...root@q.fhi-berlin.mpg.de"

Create FHI/setInitSSHKeyToIdracs::

    #!/bin/bash
    # Feed the racadm command stored in idracSSHKey to every iDRAC.
    for ip in 10.0.12.{10..29}
    do
        echo "Connecting to $ip; you will be asked for a password (if it's a new key) -> calvin"
        ssh "$ip" < idracSSHKey
    done

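
After the keys have been distributed, it is worth checking that a password-free login really works on every iDRAC. The following is a minimal sketch: ``check_key_login`` and the ``SSH_CMD`` override are hypothetical names, and ``racadm getsysinfo`` is used only as a harmless command to run remotely:

```shell
#!/bin/bash
# Sketch: report for each address whether a key-based (password-free) login
# works. BatchMode=yes makes ssh fail instead of prompting for a password.
# SSH_CMD is a hypothetical override, useful for trying the function
# without real hardware.
check_key_login() {
    local ip
    for ip in "$@"; do
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 \
               "root@$ip" racadm getsysinfo >/dev/null 2>&1; then
            echo "$ip ok"
        else
            echo "$ip FAILED"
        fi
    done
}
```

``check_key_login 10.0.12.{10..29}`` then prints one line per iDRAC; any address marked ``FAILED`` needs the key to be installed again or its cabling checked.
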

.. _update-cluster-software:

Update Cluster software
=======================
This must be done with yum::

    yum clean all
    rm -rf /var/cache/yum
    yum --enablerepo=updates check-update
    yum --enablerepo=updates update

Now the new packages should be copied to the Rocks install contrib directory, but the
source directory does not seem to exist, so the following fails::

    cp /var/cache/yum/x86_64/7/updates/packages/* /export/rocks/install/contrib/7.0/x86_64/RPMS/

Waiting for information from the mailing list.

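
One possible reason for the missing directory: by default, yum on CentOS 7 does not keep downloaded packages after installing them (``keepcache=0`` in :file:`/etc/yum.conf`). A possible workaround, untested on this cluster, is to download the updates into a directory of our choice with yum's ``--downloadonly`` and ``--downloaddir`` options before installing them. The helper name ``download_updates_cmd`` is hypothetical and only prints the command:

```shell
#!/bin/bash
# Sketch (untested on this cluster): print the yum call that downloads the
# pending updates into DIR without installing them.
download_updates_cmd() {
    local dir="$1"
    echo "yum --enablerepo=updates update --downloadonly --downloaddir=$dir"
}
```

For example, ``download_updates_cmd /export/rocks/install/contrib/7.0/x86_64/RPMS`` prints a command that would place the RPMs directly in the contrib directory. Whether this is the recommended way to update a Rocks distribution is still an open question for the mailing list.
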
Activate ldap authentication
----------------------------
Pointless, as GID and UID would have to be offset by 1000...

Activate sssd (should be better than nscd)::

    yum install -y sssd
    yum downgrade sssd-client
    yum downgrade libsss_idmap
    yum install -y sssd
    authconfig --enableldap --enableldapauth --ldapserver="ldap.rz-berlin.mpg.de" --ldapbasedn="ou=people,dc=ppb,dc=rz-berlin,dc=mpg,dc=de" --update --enablemkhomedir
    yum install c-ares-devel
    authconfig --enableldap --enableldapauth \
        --ldapserver="ldap.rz-berlin.mpg.de" \
        --ldapbasedn="ou=people,dc=ppb,dc=rz-berlin,dc=mpg,dc=de" --update \
        --enablemkhomedir
    systemctl stop sssd.service
    systemctl start sssd.service
    systemctl status sssd.service

Software Install
================
Intel compiler 2016.4 (according to Gert) and Intel compiler 2018.1.

Download the Intel License Manager.

Intel 2018.1 with PGI support!

This needs 32-bit libraries::

    yum install libstdc++-devel.i686
    yum install glibc-devel.i686
    yum install libgcc.i686    # already installed

.. image:: _static/basic_screenshot.png

Now we will start to customize our docs. Grab a couple of files from
the `web site <https://github.com/matplotlib/sampledoc>`_
or git. You will need :file:`getting_started.rst` and
:file:`_static/basic_screenshot.png`. All of the files live in the
"completed" version of this tutorial, but since this is a tutorial,
we'll just grab them one at a time, so you can learn what needs to be
changed where. Since we have more files to come, I'm going to grab
the whole git directory and just copy the files I need over for now.
First, I'll cd up back into the directory containing my project, check
out the "finished" product from git, and then copy in just the files I
need into my :file:`sampledoc` directory::

    home:~/tmp/sampledoc> pwd
    /Users/jdhunter/tmp/sampledoc
    home:~/tmp/sampledoc> cd ..
    home:~/tmp> git clone https://github.com/matplotlib/sampledoc.git tutorial
    Cloning into 'tutorial'...
    remote: Counting objects: 87, done.
    remote: Compressing objects: 100% (43/43), done.
    remote: Total 87 (delta 45), reused 83 (delta 41)
    Unpacking objects: 100% (87/87), done.
    Checking connectivity... done
    home:~/tmp> cp tutorial/getting_started.rst sampledoc/
    home:~/tmp> cp tutorial/_static/basic_screenshot.png sampledoc/_static/

The last step is to modify :file:`index.rst` to include the
:file:`getting_started.rst` file (be careful with the indentation: the
"g" in "getting_started" should line up with the ':' in ``:maxdepth``)::

    Contents:

    .. toctree::
       :maxdepth: 2

       getting_started.rst

and then rebuild the docs::

    cd sampledoc
    make html

When you reload the page by refreshing your browser pointing to
:file:`_build/html/index.html`, you should see a link to the
"Getting Started" docs, and in there this page with the screenshot.
`Voila!`

Note that we used the image directive to include the screenshot above
with::

    .. image:: _static/basic_screenshot.png

Next we'll customize the look and feel of our site to give it a logo,
some custom css, and update the navigation panels to look more like
the `sphinx <http://sphinx.pocoo.org/>`_ site itself -- see
:ref:`custom_look`.