Department of Biostatistics and Bioinformatics Computing Resources

Computing Servers Specs and Setup:

Hardware

  1. Two 16-core AMD Opteron 8222SE servers with 64 GB (expandable to 128 GB) of RAM
  2. One 8-core AMD Opteron 8222SE server with 32 GB (expandable to 64 GB) of RAM
  3. The OS platform is CentOS 5 (a free rebuild of RHEL 5)
  4. Each server has 2.1 TB of usable local storage (with RAID 10 redundancy)
  5. Each machine mounts the user's OIT AFS share as the home directory
  6. Each machine also mounts the storage shares from the other two machines as network shares (cool feature):
    -bash-3.1$ df -H
    Filesystem             Size   Used  Avail Use% Mounted on
    /dev/mapper/VolGroup00-root
                           466G   2.1G   440G   1% /
    /dev/md0               2.1G    49M   2.0G   3% /boot
    tmpfs                   34G      0    34G   0% /dev/shm
    AFS                    9.3G      0   9.3G   0% /afs
    /dev/sdc1              2.2T   284M   2.1T   1% /srv/bb16a
    biostat-bioinfo-03:/srv/bb8a
                           2.2T   2.6G   2.1T   1% /srv/bb8a
    biostat-bioinfo-02:/srv/bb16b
                           2.2T   208M   2.1T   1% /srv/bb16b
    
    

Select List of Installed Software (as of 01/09/2008)

  1. R 2.6.1 along with core packages
  2. Bioconductor release 2.1 with core packages
  3. GCC 4.1.1 C/C++/Fortran compilers (gcc, g++, gfortran)
  4. LAM/MPI (lam, lam-dev) with the mpicc, mpiCC, and mpif77 wrapper compilers
  5. Python, Perl, Sun Java
  6. emacs, pico, vi
  7. Subversion and CVS
  8. teTeX, psutils, enscript, a2ps, ispell, aspell
  9. other standard GNU tools

Other Available Software

  1. Any RPM package in the CentOS yum repositories (free open-source software)
  2. Installation of non-RPM open-source software needs to be arranged with OIT
  3. Intel C++ and Fortran (no fee for academic use)
  4. Mathematica and Maple (no fee through OIT site-license)
  5. Matlab, SAS and IMSL (requires payment of license fees)

Login Instructions

Use your favorite ssh client to access the servers using your NetID:
$ ssh owzar001@bb16a.oit.duke.edu
$ ssh owzar001@bb16b.oit.duke.edu
$ ssh owzar001@bb8a.oit.duke.edu
Use the -X switch to enable X forwarding:
$ ssh -X owzar001@bb16a.oit.duke.edu
$ ssh -X owzar001@bb16b.oit.duke.edu
$ ssh -X owzar001@bb8a.oit.duke.edu
If you use one of those "other" so-called operating systems, you can download an ssh client from the OIT site.

Hardware, Software and Account Management

  1. The servers are managed by Duke OIT
  2. You need to have a NetID to be able to use these resources
  3. If you have a NetID but have forgotten your password, follow the information provided here
  4. For account and software requests, please e-mail kouros[dot]owzar[at]duke[dot]edu

Installing R and Bioconductor packages

  1. You can request R or Bioconductor packages to be added system-wide
  2. Alternatively, you can install (with relative ease) local versions of R packages:
-bash-3.1$ R
R> install.packages("xtable")
Warning in install.packages("xtable") :
  argument 'lib' is missing: using '/usr/local/lib/R/site-library'
Warning in install.packages("xtable") :
  'lib = "/usr/local/lib/R/site-library"' is not writable
Would you like to create a personal library
'~/R/x86_64-redhat-linux-gnu-library/2.6'
to install packages into?  (y/n)
  • By default, the package will be installed in ~/R/x86_64-redhat-linux-gnu-library/2.6. Note that following installation, this directory is on your R library path:
    -bash-3.1$ R
    R> .libPaths()
    [1] "/afs/acpub/users/o/w/owzar001/R/x86_64-redhat-linux-gnu-library/2.6"
    [2] "/usr/lib64/R/library"
    
  • Given that the AFS share is mounted on each server, you only have to do one local installation. A sketch of an explicit local installation is given below.
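
    For instance, a minimal sketch of installing into the personal library explicitly (the library path matches the .libPaths() output above; the CRAN mirror URL is only an example):

    -bash-3.1$ R
    R> lib="~/R/x86_64-redhat-linux-gnu-library/2.6"  # personal library on the AFS home share
    R> dir.create(lib,recursive=TRUE,showWarnings=FALSE)
    R> install.packages("xtable",lib=lib,repos="http://cran.r-project.org")
    R> library(xtable)  # the personal library is already on .libPaths()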
Parallelization Example

    1. To illustrate the parallelization facilities provided by these servers, we will discuss a simple example of little or no practical use
    2. We will generate B replicates from the sampling distribution of the MAD (median absolute deviation) based on pseudo-random samples of size n
    3. We will use the facilities provided by the R packages snow and Rmpi
    4. The following function generates B replicates of the MAD based on pseudo-random samples of size n (default 100000) from a standard normal law
       
      > # B replicates of the MAD of n standard normal draws
      > foo=function(B,n=100000){replicate(B,mad(rnorm(n)))}
      > B=16000
      

    Example (single thread)

    Let us try this function on a single thread:
     
    > B=16000
    > unix.time(foo(B))
       user  system elapsed
    806.520   8.029 814.696
    

    Example (two nodes)

    We will create an MPI cluster with two nodes. In R, the snow package provides the necessary interface. The makeCluster() command creates the cluster (note that it loads the Rmpi package), while the stopCluster() command terminates the cluster. Note that since the job is split up between two nodes, each node will produce B/2 (=8000) replicates.
     
    > library(snow)
    > k=2
    > cl=makeCluster(k)
    Loading required package: Rmpi
            2 slaves are spawned successfully. 0 failed.
    > unix.time(clusterCall(cl,foo,B=B/k))
       user  system elapsed
      0.007   0.001 424.004
    > stopCluster(cl)
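
    clusterCall() returns a list with one element per node. A minimal sketch of combining the per-node replicates into a single vector of B replicates (this assumes the cluster cl is still running, i.e., it is run before the stopCluster() call above):

    > res=clusterCall(cl,foo,B=B/k)  # list of k vectors, each of length B/k
    > mads=do.call(c,res)            # a single vector of B (=16000) replicates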
    

    Example (four nodes)

    Next, try it using four nodes:
     
    > k=4
    > cl=makeCluster(k)
            4 slaves are spawned successfully. 0 failed.
    > unix.time(clusterCall(cl,foo,B=B/k))
       user  system elapsed
      0.012   0.001 217.528
    

    Example (eight nodes)

    Next, try it using eight nodes:
     
    > k=8
    > cl=makeCluster(k)
            8 slaves are spawned successfully. 0 failed.
    > unix.time(clusterCall(cl,foo,B=B/k))
       user  system elapsed
      0.019   0.002 129.770
    > stopCluster(cl)
    

    Example (sixteen nodes)

    Finally, let us try using all 16 cores on bb16a:
     
    > k=16
    > cl=makeCluster(k)
            16 slaves are spawned successfully. 0 failed.
    > unix.time(clusterCall(cl,foo,B=B/k))
       user  system elapsed
      0.043   0.001  90.829
    > stopCluster(cl)
    

    Example (summary)

    The table below summarizes the elapsed times in minutes for each node count. The cmp column gives the ideal elapsed time under perfect linear speedup, i.e., the single-node time divided by the number of nodes:

     nodes minutes  cmp
    1     1    13.5 13.5
    2     2     7.1  6.8
    3     4     3.6  3.4
    4     8     2.2  1.7
    5    16     1.5  0.8
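
    For reference, a minimal sketch of how this summary can be computed in R from the elapsed times recorded above (rounding may differ slightly in the last digit):

    > elapsed=c(814.696,424.004,217.528,129.770,90.829)  # seconds, from the examples above
    > nodes=c(1,2,4,8,16)
    > minutes=round(elapsed/60,1)
    > cmp=round(minutes[1]/nodes,1)  # ideal time under perfect linear speedup
    > data.frame(nodes,minutes,cmp)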
    

    Concluding remarks

    1. The actual speedup depends on many factors (MPI overhead, availability of system resources, etc.)
    2. Most importantly, it will depend on the fraction of the job that is serialized: by Amdahl's law, the speedup attainable on N nodes is at most 1/((1-p) + p/N), where p is the fraction of the job that can be parallelized
    3. The kind of parallelization considered in this example works best for so-called embarrassingly parallel problems (e.g., permutation resampling); see the sketch after this list
    4. You can use the lamhalt command (from the shell) to shut down the LAM/MPI daemons and kill all lam jobs
    5. The Intel compilers provide OpenMP parallelization support (gcc 4.2 provides this as well)
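
    As an illustration of item 3, here is a minimal sketch of a parallelized permutation test using snow. The data (x, y), the statistic, and the number of permutations are hypothetical, chosen only for illustration; for serious work one would also set up proper parallel random number streams (e.g., with snow's clusterSetupRNG()).

    > library(snow)
    > x=rnorm(50,mean=0.2)  # hypothetical two-sample data
    > y=rnorm(50)
    > obs=mean(x)-mean(y)   # observed difference in means
    > # one permutation replicate: shuffle the pooled sample and recompute the statistic
    > permStat=function(i,pooled,nx){s=sample(pooled);mean(s[1:nx])-mean(s[-(1:nx)])}
    > cl=makeCluster(8)
    > perm=parSapply(cl,1:10000,permStat,pooled=c(x,y),nx=length(x))
    > stopCluster(cl)
    > mean(abs(perm)>=abs(obs))  # two-sided permutation p-value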