Tuesday, May 23, 2017

The joy of replacing a local disk with SVM

We still have some old Sun Fire V245 servers here, and today I had the pleasure of replacing a failed disk in one of them.

The server is still running Solaris 10 and using SVM to mirror the root disks.

What's the failed disk in question?

# iostat -En
c0t0d0           Soft Errors: 3039 Hard Errors: 75 Transport Errors: 14
Vendor: FUJITSU  Product: MAY2073RCSUN72G  Revision: 0501 Serial No: xxxxxxxxxx
Size: 73.41GB <73407865856 bytes>
Media Error: 64 Device Not Ready: 0 No Device: 11 Recoverable: 3039
Illegal Request: 2 Predictive Failure Analysis: 12
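
With more than a handful of disks, scanning this output by eye gets tedious. Here is a minimal sketch that filters iostat -En style text for devices with a non-zero hard error counter. The function name failing_disks is made up for this example, and the second sample line is hypothetical.

```sh
#!/bin/sh
# Print "<device> <hard errors>" for every disk whose Hard Errors counter is > 0.
# Expects `iostat -En` style text on stdin.
failing_disks() {
    awk '$2 == "Soft" && $3 == "Errors:" && $7 + 0 > 0 { print $1, $7 }'
}

# Sample input: the first line is taken from the output above,
# the second line is a hypothetical healthy disk.
failing_disks <<'EOF'
c0t0d0           Soft Errors: 3039 Hard Errors: 75 Transport Errors: 14
c0t1d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
EOF
```

On a live system the same function would be fed with "iostat -En | failing_disks".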

Nasty, let's configure it out.

# disk=c0t0d0
# metastat -p > /etc/lvm/md.tab
# grep $disk /etc/lvm/md.tab
d21 1 1 c0t0d0s1
d11 1 1 c0t0d0s0
# metadetach d10 d11
d10: submirror d11 is detached
# metadetach d20 d21
metadetach: solaris: d20: attempt an operation on a submirror that has erred components

Oops, I hope the force is still with me...

# metastat d20
d20: Mirror
    Submirror 0: d21
      State: Needs maintenance
    Submirror 1: d22
      State: Okay
...
# metadetach -f d20 d21
d20: submirror d21 is detached
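
The detach steps above need the parent mirror of every submirror that lives on the failing disk. That mapping can be pulled out of the metastat -p output. The following is only a sketch: the function name submirrors_on_disk is made up, and the mirror lines and the d12 entry in the sample are assumptions about what this box would print, not taken from it.

```sh
#!/bin/sh
# Print "<mirror> <submirror>" for every submirror built on disk $1.
# Expects `metastat -p` style text (md.tab format) on stdin.
submirrors_on_disk() {
    awk -v disk="$1" '
        $2 == "-m" {                       # mirror line: dNN -m sub1 sub2 ... pass
            for (i = 3; i <= NF; i++)
                if ($i ~ /^d[0-9]+$/) parent[$i] = $1
            next
        }
        index($NF, disk "s") == 1 {        # concat/stripe whose slice is on the disk
            found[++n] = $1
        }
        END {
            for (i = 1; i <= n; i++)
                print parent[found[i]], found[i]
        }
    '
}

# Sample input: d11, d21 and d22 appear in the post, the rest is assumed.
submirrors_on_disk c0t0d0 <<'EOF'
d10 -m d11 d12 1
d11 1 1 c0t0d0s0
d12 1 1 c0t1d0s0
d20 -m d21 d22 1
d21 1 1 c0t0d0s1
d22 1 1 c0t1d0s1
EOF
```

Each output line is a ready-made argument pair for metadetach.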

I had such anger, but it's all good now. Let's continue.

# metaclear d11
d11: Concat/Stripe is cleared
# metaclear d21
d21: Concat/Stripe is cleared
...
# metadb | grep $disk
     a m  p  luo        16              8192            /dev/dsk/c0t0d0s7
     a    p  luo        8208            8192            /dev/dsk/c0t0d0s7
     a    p  luo        16400           8192            /dev/dsk/c0t0d0s7
# metadb -d ${disk}s7
# cfgadm -al | grep $disk
c0::dsk/c0t0d0                 disk         connected    configured   unknown
# cfgadm -c unconfigure c0::dsk/c0t0d0
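
Before deleting state database replicas it is worth confirming that the surviving disk still holds some. A small sketch that counts replicas per disk from metadb style output follows. The function name replicas_per_disk is made up, and the c0t1d0s7 lines in the sample are assumed rather than copied from this server.

```sh
#!/bin/sh
# Print "<disk> <replica count>" per disk, in order of first appearance.
# Expects `metadb` style text on stdin.
replicas_per_disk() {
    awk '$NF ~ /^\/dev\/dsk\// {
             d = $NF
             sub(/^\/dev\/dsk\//, "", d)    # e.g. c0t0d0s7
             sub(/s[0-9]+$/, "", d)         # e.g. c0t0d0
             if (!(d in count)) order[++n] = d
             count[d]++
         }
         END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }'
}

# Sample input: the c0t0d0s7 lines are from the post, the c0t1d0s7 lines are assumed.
replicas_per_disk <<'EOF'
     a m  p  luo        16              8192            /dev/dsk/c0t0d0s7
     a    p  luo        8208            8192            /dev/dsk/c0t0d0s7
     a    p  luo        16400           8192            /dev/dsk/c0t0d0s7
     a    p  luo        16              8192            /dev/dsk/c0t1d0s7
     a    p  luo        8208            8192            /dev/dsk/c0t1d0s7
     a    p  luo        16400           8192            /dev/dsk/c0t1d0s7
EOF
```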

Now we can physically replace the failed disk.

# tail -f /var/adm/messages
...
May 23 12:30:20 solaris genunix: [ID 408114 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1/sd@0,0 (sd0) offline
May 23 12:30:27 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1 (mpt0):
May 23 12:30:27 solaris    mpt_handle_event_sync : SAS target 0 added.
May 23 12:30:27 solaris scsi: [ID 583861 kern.info] sd0 at mpt0: unit-address 0,0: target 0 lun 0
May 23 12:30:27 solaris genunix: [ID 936769 kern.info] sd0 is /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1/sd@0,0
May 23 12:30:28 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1/sd@0,0 (sd0):
May 23 12:30:28 solaris    Corrupt label - label checksum failed
May 23 12:30:28 solaris genunix: [ID 408114 kern.info] /pci@1e,600000/pci@0/pci@a/pci@0/pci@8/scsi@1/sd@0,0 (sd0) online

Alright, let's configure the new disk in.

# cfgadm -c configure c0::dsk/c0t0d0
# format $disk

c0t0d0: configured with capacity of 68.35GB
selecting c0t0d0
[disk formatted]
...
format> label
Ready to label disk, continue? y

format> quit

Time to rebuild our SVM mirror.

# metastat d20
...
Device Relocation Information:
Device   Reloc  Device ID
c0t1d0   Yes    id1,sd@n500000e016af27c0

# sourcedisk=c0t1d0
# prtvtoc /dev/rdsk/${sourcedisk}s2 | fmthard -s - /dev/rdsk/${disk}s2
# metadb -a -c3 ${disk}s7
# metainit d11
# metainit d21
# metattach d10 d11
d10: submirror d11 is attached
# metattach d20 d21
d20: submirror d21 is attached
# metadevadm -u $disk
Updating Solaris Volume Manager device relocation information for c0t0d0
Old device reloc information:
        id1,sd@n500000e0135e2dd0
New device reloc information:
        id1,sd@n500000e0135e2dd0

# installboot /usr/platform/$(uname -i)/lib/fs/ufs/bootblk /dev/rdsk/${disk}s0

# metastat -c
d20              m   4.0GB d22 d21 (resync-1%)
    d22          s   4.0GB c0t1d0s1
    d21          s   4.0GB c0t0d0s1

Once the resync is done we have our OS disks mirrored again.
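
To keep an eye on the resync without reading the whole metastat output, the percentage can be cut out of the "(resync-N%)" marker. This is just a sketch that assumes the marker keeps the format shown above. The function name resync_status is made up.

```sh
#!/bin/sh
# Print "<metadevice> <percent>" for every line carrying a "(resync-N%)" marker.
# Expects `metastat -c` style text on stdin.
resync_status() {
    awk 'match($0, /\(resync-[0-9]+%\)/) {
             # "(resync-" is 8 characters long, "%)" another 2.
             print $1, substr($0, RSTART + 8, RLENGTH - 10)
         }'
}

# Sample input taken from the output above.
resync_status <<'EOF'
d20              m   4.0GB d22 d21 (resync-1%)
    d22          s   4.0GB c0t1d0s1
    d21          s   4.0GB c0t0d0s1
EOF
```

Running "metastat -c | resync_status" in a loop with a sleep in between would print nothing once all mirrors are in sync.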

Thursday, May 11, 2017

Ansible 2.3 on Solaris 11.3

Let's automate some tasks we do all the time.

Yes, I know, I could use Puppet, but the Puppet version bundled with Solaris 11.3 is rather old (3.6.2) and some modules are broken (e.g. "ldap provider always finds authentication_method out of sync (Bug 22166490)"). So we'll use Ansible instead.

$ wget http://releases.ansible.com/ansible/ansible-2.3.1.0.tar.gz

# cd /opt
# gzcat .../ansible-2.3.1.0.tar.gz | tar xf -
# cat ansible-2.3.1.0/requirements.txt
...
jinja2
PyYAML
paramiko
pycrypto >= 2.6
setuptools

Seems like we need some Python modules to get Ansible working. Let's install them.

# pkg install --no-backup-be jinja2 pyyaml setuptools
           Packages to install: 22
...

We don't need paramiko and pycrypto, because we'll use the ssh client as the transport instead; see "Replace PyCrypto usage with cryptography.io" (#13075).

To speed things up we need an SSH client that supports ControlPersist. So let's install OpenSSH as well and make it the default.

# pkg install --no-backup-be network/openssh
           Packages to install:  1
...
# pkg set-mediator --no-backup-be -I openssh ssh
            Packages to change:  3
...
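
To actually benefit from ControlPersist, Ansible can be told to always use the ssh client and to keep the master connection open. A minimal sketch that writes such an ansible.cfg is shown here. The helper name write_ansible_cfg, the temporary directory and the 60 second timeout are choices made for this example, not something from the setup above.

```sh
#!/bin/sh
# Write a minimal ansible.cfg into directory $1 (hypothetical helper).
write_ansible_cfg() {
    cat > "$1/ansible.cfg" <<'EOF'
[defaults]
# Always use the OpenSSH client rather than paramiko.
transport = ssh

[ssh_connection]
# Keep a master connection open so that later tasks reuse it.
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
EOF
}

# Example: write the file into a scratch directory and show it.
cfgdir=$(mktemp -d)
write_ansible_cfg "$cfgdir"
cat "$cfgdir/ansible.cfg"
```

Ansible picks the file up from the current directory, or from the path given in the ANSIBLE_CONFIG environment variable.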

Does it work?

# PYTHONPATH=/opt/ansible-2.3.1.0/lib /opt/ansible-2.3.1.0/bin/ansible --version
ansible 2.3.1.0
  config file =
  configured module search path = Default w/o overrides
  python version = 2.7.9 (default, Dec  1 2016, 10:32:39) [C]

# PYTHONPATH=/opt/ansible-2.3.1.0/lib /opt/ansible-2.3.1.0/bin/ansible localhost -m ping
 [WARNING]: Host file not found: /etc/ansible/hosts

 [WARNING]: provided hosts list is empty, only localhost is available

localhost | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Good. Time to write our first Playbook...

Monday, May 8, 2017

FreeBSD aesni(4) and openssl

I have to admit, I also like FreeBSD. The question came up whether you need to kldload aesni to speed up openssl (or any other application that uses libcrypto), and the short answer is no. The long answer...

What is AES-NI again?

The new AES-NI instruction set is comprised of six new instructions that perform several compute intensive parts of the AES algorithm. These instructions can execute using significantly less clock cycles than a software solution.

So the AES-NI instructions are basically just more mnemonics like ADD, SUB, XOR, MOV, AND, etc. You can simply use them from user space. And that's what OpenSSL/LibreSSL do, see aesni-x86_64.S.

.globl aesni_cbc_encrypt
...
aesni_cbc_encrypt:
...
.byte 102,15,56,220,209
...

And there's our AESENC opcode. It's in .byte notation because FreeBSD is still using a GPLv2 binutils that's too old for AES-NI mnemonics.

In Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2 we see that the AESENC instruction has the following opcode.

66 0F 38 DC /r
AESENC xmm1, xmm2/m128

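So ".byte 102,15,56,220,209" translates, after some decimal to hexadecimal conversion, to "0x66, 0x0F, 0x38, 0xDC, 0xD1". The first four bytes are exactly the AESENC opcode above; the trailing 0xD1 is the ModRM byte (the "/r" in the opcode) that selects the two XMM registers.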
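
The conversion is easy to redo in the shell. The sketch below prints the five .byte values in hexadecimal and splits the last one into the three ModRM fields. Both function names are made up for this example.

```sh
#!/bin/sh
# Print each decimal argument as a two-digit uppercase hex byte.
bytes_to_hex() {
    out=""
    for b in "$@"; do
        out="$out${out:+ }$(printf '%02X' "$b")"
    done
    printf '%s\n' "$out"
}

# Split a ModRM byte into its mod (2 bits), reg (3 bits) and rm (3 bits) fields.
modrm_fields() {
    printf 'mod=%d reg=%d rm=%d\n' $(( $1 >> 6 )) $(( ($1 >> 3) & 7 )) $(( $1 & 7 ))
}

bytes_to_hex 102 15 56 220 209    # prints: 66 0F 38 DC D1
modrm_fields 209                  # prints: mod=3 reg=2 rm=1
```

mod=3 is the register-to-register form, so this particular AESENC operates on XMM2 (reg) and XMM1 (rm).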

So there is no kernel stuff involved. What is aesni(4) used for then?

The aesni driver registers itself to accelerate AES operations for crypto(4).

In sys/crypto/aesni/aesni.c we see that the AES-NI kernel module registers itself as hardware capable for the following ciphers (see also crypto(9)).

static int
aesni_attach(device_t dev)
{
 struct aesni_softc *sc;
...
 sc->cid = crypto_get_driverid(dev, CRYPTOCAP_F_HARDWARE |
     CRYPTOCAP_F_SYNC);
...
 crypto_register(sc->cid, CRYPTO_AES_CBC, 0, 0);
 crypto_register(sc->cid, CRYPTO_AES_ICM, 0, 0);
 crypto_register(sc->cid, CRYPTO_AES_NIST_GCM_16, 0, 0);
 crypto_register(sc->cid, CRYPTO_AES_128_NIST_GMAC, 0, 0);
 crypto_register(sc->cid, CRYPTO_AES_192_NIST_GMAC, 0, 0);
 crypto_register(sc->cid, CRYPTO_AES_256_NIST_GMAC, 0, 0);
 crypto_register(sc->cid, CRYPTO_AES_XTS, 0, 0);
...

So the aesni kernel module provides crypto services through the OpenCrypto framework, both to user space via crypto(4) (that's the /dev/crypto device) and to kernel space via crypto(9) (think of GELI and IPsec).

Still not convinced that openssl is not using aesni(4)? Let's fire up DTrace just to be sure...

For the first test, no aesni(4) kernel module is loaded. We use openssl with -evp to make sure we're actually using hardware crypto (and you'll see it's calling libcrypto's aesni_cbc_encrypt function we talked about earlier). Notice there is almost no user/kernel space boundary crossing.

# kldstat 
Id Refs Address            Size     Name
 1    6 0xffffffff80200000 1fa7c38  kernel
 2    1 0xffffffff82219000 249d     ulpt.ko
 3    1 0xffffffff8221c000 adec     tmpfs.ko

# kldload dtraceall

# dtrace -n 'pid$target:libcrypto.so.8:*aesni*:entry { @[probefunc] = count(); }' \
  -c "openssl speed -elapsed -evp aes-128-cbc"
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc       5973.14k    23210.84k    84090.94k   238951.75k   543361.71k
...
  aesni_cbc_encrypt                                           4105425

# dtrace -n 'fbt:kernel:copy*:entry /pid == $target/ { @bytes[probefunc] = quantize(arg2); }' \
  -c "openssl speed -elapsed -evp aes-128-cbc"
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     602153.10k   640760.18k   643890.94k   660515.73k   666079.99k
...
  copyin                                            
           value  ------------- Distribution ------------- count    
               4 |                                         0        
               8 |@                                        2        
              16 |@@@@@@@@@@@@                             16       
              32 |@@@@@@@@@@@@@@@@                         22       
              64 |@@@                                      4        
             128 |                                         0        
             256 |@@@@                                     5        
             512 |@@@@                                     5        
            1024 |                                         0        

  copyinstr                                         
           value  ------------- Distribution ------------- count    
             512 |                                         0        
            1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 6        
            2048 |                                         0        

  copyout                                           
           value  ------------- Distribution ------------- count    
               4 |                                         0        
               8 |@                                        1        
              16 |@@@@@@@@@                                15       
              32 |@@@@@@@@                                 14       
              64 |@                                        2        
             128 |@@@@@@@@@@@@                             21       
             256 |@@@                                      5        
             512 |@@@                                      6        
            1024 |                                         0        
            2048 |@@@                                      5        
            4096 |                                         0      

Let's load the aesni(4) kernel module now and check if any kernel aesni probes fire.

# kldload aesni
aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM> on motherboard

# dtrace -n 'fbt:aesni::entry /pid == $target/ { @[probefunc] = count(); }' \
  -c "openssl speed -elapsed -evp aes-128-cbc"
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     616989.41k   668602.69k   670494.73k   673865.32k   679105.88k
...

No probes fired. That means openssl is not using the aesni kernel functions at all.

BUT, we can make openssl use aesni(4) with -engine cryptodev. Let's check if any kernel aesni and user/kernel space boundary crossing probes fire now.

# kldload cryptodev
# openssl engine -c -tt
(cryptodev) BSD cryptodev engine
 [RSA, DSA, DH, AES-128-CBC, AES-192-CBC, AES-256-CBC]
     [ available ]
...

# dtrace -n 'fbt:aesni::entry /pid == $target/ { @[probefunc] = count(); }' \
  -c "openssl speed -elapsed -evp aes-128-cbc"
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc       4056.66k    15608.91k    57565.04k   183731.20k   492147.10k
...
  aesni_cipher_setup_common                                         8
  aesni_freesession                                                 8
  aesni_newsession                                                  8
  aesni_cipher_alloc                                          3036256
  aesni_encrypt_cbc                                           3036256
  aesni_process                                               3036256

# dtrace -n 'fbt:kernel:copy*:entry /pid == $target/ { @bytes[probefunc] = quantize(arg2); }' \
  -c "openssl speed -elapsed -evp aes-128-cbc"
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc       3602.78k    13827.67k    51851.57k   168562.00k   472572.78k
...
  copyinstr                                         
           value  ------------- Distribution ------------- count    
             512 |                                         0        
            1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 6        
            2048 |                                         0        

  copyout                                           
           value  ------------- Distribution ------------- count    
               2 |                                         0        
               4 |                                         8        
               8 |                                         1        
              16 |@@@@@                                    675536   
              32 |@@@@@@@@@@@@@@@@@@@@                     2603253  
              64 |@@@@@                                    649862   
             128 |                                         21       
             256 |@@@@@                                    609223   
             512 |                                         6        
            1024 |@@@@                                     495120   
            2048 |                                         5        
            4096 |                                         0        
            8192 |@                                        173512   
           16384 |                                         0        

  copyin                                            
           value  ------------- Distribution ------------- count    
               2 |                                         0        
               4 |                                         15       
               8 |                                         3        
              16 |@@@@@@@@@@@@@@@@@                        3278781  
              32 |@@@@@@@@@@@@@                            2603265  
              64 |@@@                                      649864   
             128 |                                         0        
             256 |@@@                                      609223   
             512 |                                         5        
            1024 |@@@                                      495120   
            2048 |                                         0        
            4096 |                                         0        
            8192 |@                                        173512   
           16384 |                                         0  

That's weird: the cryptodev engine seems to be used by default as soon as the cryptodev kernel module is loaded, even though we never passed -engine cryptodev. Also note the huge amount of user/kernel space boundary crossing. That's a lot of data copied from user space to kernel space and back again for nothing. Mental note: never load the cryptodev kernel module unless you are using a hifn(4), safe(4) or ubsec(4) crypto accelerator.
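
To put a number on the penalty, we can divide the throughput of the run with only aesni loaded by the throughput of the first cryptodev run, per block size. This is a quick sketch using awk. The function name speedup is made up, and the two input lines are copied from the openssl speed outputs above.

```sh
#!/bin/sh
# Print the per-column ratio between two "openssl speed" result lines.
# $1 = faster line, $2 = slower line; the first field (cipher name) is skipped.
speedup() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { n = NF; for (i = 2; i <= n; i++) fast[i] = $i + 0 }
        NR == 2 {
            for (i = 2; i <= n; i++)
                printf "%s%.1f", (i > 2 ? " " : ""), fast[i] / ($i + 0)
            print ""
        }'
}

userspace="aes-128-cbc     616989.41k   668602.69k   670494.73k   673865.32k   679105.88k"
cryptodev="aes-128-cbc       4056.66k    15608.91k    57565.04k   183731.20k   492147.10k"
speedup "$userspace" "$cryptodev"
```

The first column is the 16 byte case, where every tiny block pays for a full round trip into the kernel. The gap shrinks as the blocks grow.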

Tuesday, May 2, 2017

AI install server using a https IPS repo

Last time we created a local IPS repository (see https://crc32c.blogspot.de/2017/04/https-ips-repository-using-pkgdepotd.html) and added the latest SRU to it (see https://crc32c.blogspot.de/2017/04/how-to-add-sru-to-local-ips-repository.html).

Now it's time to create an AI install server, add some customizations and netboot/netinstall our first server.

# zfs create tank/install/auto_install
# zfs create tank/install/webserver_files

# pkg install --no-backup-be install/installadm

# cp /etc/certs/CA/UNIX_Dep_CA.pem /install/webserver_files/
# chown webservd:webservd /install/webserver_files/UNIX_Dep_CA.pem

# svccfg -s svc:/system/install/server:default
svc:/system/install/server:default> setprop all_services/default_imagepath_basedir = /install/auto_install
svc:/system/install/server:default> setprop all_services/enable_webui = false
svc:/system/install/server:default> setprop all_services/manage_dhcp = false
svc:/system/install/server:default> setprop all_services/webserver_files_dir = /install/webserver_files
svc:/system/install/server:default> refresh
svc:/system/install/server:default> ^D

# svcadm enable svc:/system/install/server:default

# installadm create-service -n solaris11_3-sparc -p solaris=https://pkg.mycompany.com/solaris/
OK to use subdir of /install/auto_install to store image? [y|N]: y
...
100% : Created Service: 'solaris11_3-sparc'
...

Good, now let's edit/create the manifest and system configuration profile.

The AI_HOSTNAME, AI_IPV4, etc. variables are resolved using data supplied by the dhcpd server we'll set up in a moment.

# installadm export -n solaris11_3-sparc -m orig_default -o orig_default
# cat orig_default
...
      <source>
        <publisher name="solaris">
          <origin name="https://pkg.mycompany.com/solaris/"/>
          <credentials>
            <ca_cert src="http://pkg.mycompany.com:5555/files/UNIX_Dep_CA.pem"/>
          </credentials>
        </publisher>
      </source>
...

# installadm update-manifest -n solaris11_3-sparc -f ./orig_default
Changed Manifest: 'orig_default'

# sysconfig create-profile -o sc
# cat sc/sc_profile.xml
...
  <service version="1" type="service" name="system/identity">
    <instance enabled="true" name="node">
      <property_group type="application" name="config">
        <propval type="astring" name="nodename" value="{{AI_HOSTNAME}}"/>
      </property_group>
    </instance>
  </service>

  <service version="1" type="service" name="network/install">
    <instance enabled="true" name="default">
      <property_group type="application" name="install_ipv4_interface">
        <propval type="net_address_v4" name="static_address" value="{{AI_IPV4}}/{{AI_IPV4_PREFIXLEN}}"/>
        <propval type="astring" name="name" value="{{AI_NETLINK_VANITY}}/v4"/>
        <propval type="astring" name="address_type" value="static"/>
        <propval type="net_address_v4" name="default_route" value="{{AI_ROUTER}}"/>
      </property_group>
    </instance>
  </service>
...
  <service version="1" type="service" name="system/ocm">
    <instance enabled="false" name="default">
      <property_group type="application" name="reg">
        <propval type="astring" name="opt_out" value="true"/>
      </property_group>
    </instance>
  </service>
...

# installadm create-profile -n solaris11_3-sparc -f sc/sc_profile.xml -p custom
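
To preview what a profile might look like for a given client, the template variables can be substituted by hand. Note that this is a plain sed stand-in and not how the installer itself resolves them. The function name render_profile is made up, the values are borrowed from the dhcpd configuration further down, and the prefix length of 25 is derived from the 255.255.255.128 netmask.

```sh
#!/bin/sh
# Hypothetical preview values, borrowed from the DHCP setup in this post.
AI_HOSTNAME=ldg1
AI_IPV4=10.79.85.101
AI_IPV4_PREFIXLEN=25
AI_ROUTER=10.79.85.1

# Replace a few {{AI_...}} placeholders in the text on stdin.
# Placeholders that are not listed here are left untouched.
render_profile() {
    sed -e "s|{{AI_HOSTNAME}}|$AI_HOSTNAME|g" \
        -e "s|{{AI_IPV4}}|$AI_IPV4|g" \
        -e "s|{{AI_IPV4_PREFIXLEN}}|$AI_IPV4_PREFIXLEN|g" \
        -e "s|{{AI_ROUTER}}|$AI_ROUTER|g"
}

# Sample input: two lines taken from the profile above.
render_profile <<'EOF'
<propval type="astring" name="nodename" value="{{AI_HOSTNAME}}"/>
<propval type="net_address_v4" name="static_address" value="{{AI_IPV4}}/{{AI_IPV4_PREFIXLEN}}"/>
EOF
```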

Almost done with the AI part. Let's create our first client.

# installadm create-client -e 00:11:22:33:44:55 -n solaris11_3-sparc

And that's it. Now we need a DHCP server to assign hostnames, DNS servers, IP addresses, etc. for netbooting.

# cat << EOF > /etc/inet/dhcpd4.conf
authoritative;
log-facility local7;

option domain-name "mycompany.com";
option domain-name-servers 10.74.0.53, 10.74.5.3, 10.74.53.53;
option domain-search "mycompany.com", "lab.mycompany.com";

deny unknown-clients;

class "SPARC" {
  match if substring (option vendor-class-identifier, 0, 5) = "SUNW.";
  filename "http://pkg.mycompany.com:5555/cgi-bin/wanboot-cgi";
}

subnet 10.79.85.0 netmask 255.255.255.128 {
  option routers 10.79.85.1;
  option broadcast-address 10.79.85.127;
  option ntp-servers 10.79.85.1;
  next-server pkg.mycompany.com;
  use-host-decl-names on;
}

host ldg1 {
  hardware ethernet 00:11:22:33:44:55;
  fixed-address 10.79.85.101;
}
EOF

# chgrp sys /etc/inet/dhcpd4.conf
# /usr/lib/inet/dhcpd -t -cf /etc/inet/dhcpd4.conf
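
The syntax check will not notice a broadcast address that doesn't match the netmask, so it doesn't hurt to recompute it. Here is a small sketch in plain shell arithmetic. The function name broadcast_addr is made up.

```sh
#!/bin/sh
# Print the broadcast address for network $1 and netmask $2 (dotted quads).
broadcast_addr() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2          # split both arguments into eight octets
    IFS=$old_ifs
    # Broadcast = network OR inverted netmask, octet by octet.
    printf '%d.%d.%d.%d\n' \
        $(( $1 | (255 - $5) )) $(( $2 | (255 - $6) )) \
        $(( $3 | (255 - $7) )) $(( $4 | (255 - $8) ))
}

broadcast_addr 10.79.85.0 255.255.255.128    # prints: 10.79.85.127
```

The result matches the broadcast-address option in the subnet declaration above.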

# printf "local7.debug\t\t\t\t\t/var/log/dhcpd.log\n" >> /etc/syslog.conf
# touch /var/log/dhcpd.log
# chgrp sys /var/log/dhcpd.log
# svcadm restart svc:/system/system-log:default

# echo "/var/log/dhcpd.log -C 4 -a '/usr/sbin/svccfg -s svc:/system/system-log:default refresh'" > /etc/logadm.d/dhcpd.logadm.conf
# chmod 444 /etc/logadm.d/dhcpd.logadm.conf
# chgrp sys /etc/logadm.d/dhcpd.logadm.conf
# svcadm refresh svc:/system/logadm-upgrade:default

# svcadm enable svc:/network/dhcp/server:ipv4

Time to install our first client.

{0} ok boot net:dhcp - install
...
13:13:33    Saving credential file UNIX_Dep_CA.pem
13:13:34    Creating the CA certificate symbolic link(s)
...
13:14:23    Installing packages from:
13:14:23        solaris
13:14:23            origin:  https://pkg.mycompany.com/solaris/
...
Automated Installation finished successfully

Good bye text-install ISOs...
