Introduction
============

This guide covers the operation of the pipeline, both in production and
for the purpose of running sensitivity studies (simulated background and
signal events for various populations). It also describes
troubleshooting scenarios and strategies.

How Data is Organized
=====================

Persistent data goes into ``$XDG_DATA_HOME/llama`` or, if
``$XDG_DATA_HOME`` is not defined, into ``~/.local/share/llama``. This
includes analysis outputs, archived alerts, and log files. This
directory will be called the "output directory" or ``$OUTPUT_DIR`` in
this section.

Event Directories
~~~~~~~~~~~~~~~~~

Each event gets its own directory, named after its LIGO/Virgo GraceID
(i.e. its unique event ID on GraceDB). Event directories for the
current LIGO/Virgo run are all held in ``$OUTPUT_DIR/current_run``.
Event directories for old events/manual runs, if they are saved, live
in separate run directories, each of which should be located in
``$OUTPUT_DIR/past_runs``; for example, GW170817, the world's first BNS
merger detection, can be found on the current LLAMA analysis server in
``~/.local/share/llama/past_runs/events-o2/G298048``.

GCN Notice Archive
~~~~~~~~~~~~~~~~~~

All GCN notices heard by the server while ``gcnd`` is running will, by
default, be archived to ``$OUTPUT_DIR/gcn`` (this includes
non-LIGO/Virgo triggers). If you wish to view these alerts, you can
look in this directory, which contains subdirectories of the form
``YEAR-MONTH/DAY`` (to avoid having too many files in a single
directory).

Log Files
~~~~~~~~~

Logs are stored in ``$OUTPUT_DIR/logs``. While the formats and contents
of different log files differ based on which program created them, they
are generally named after the script that created them. The LLAMA
update daemon, ``llama run``, writes to logfiles with ``llamad`` in
their names; the GCN listener, ``llama listen gcn``, writes to
``gcn.log``. It's often useful to know which log file was written to
most recently (e.g. to find out whether the script you ran is
faithfully logging output as expected); you can do this with ``ls
-ltr``.

Running the Pipeline Automatically
==================================

*This section will be written later.*

Running the Pipeline Manually
=============================

Several crucial files can be generated manually (that is to say,
without relying on automated trigger acquisition and pipeline
execution). This section describes common usage scenarios.

Creating a New Event with the skymap_info CLI Script
----------------------------------------------------

``skymap_info`` is a command line script for creating new event
directories along with a new ``skymap_info.json`` file describing the
gravitational wave event and specifying the GW skymap that is to be
used. Creating this file is the first step in any GW multimessenger
analysis in LLAMA (see the documentation for
``llama.files.skymap_info.SkymapInfo`` for more information).
Ordinarily this file is generated by one of the listener scripts (e.g.
the ones listening for new LVAlerts from LIGO/Virgo's GraceDB or the
GCN Notice listener). The ``skymap_info`` script can be used to
generate this file manually, however, using either the event's GraceID
(its unique ID on GraceDB) or a VOEvent XML file (as generated on
GraceDB and distributed as a GCN Notice).
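The resulting event directory is named after the GraceID and lives
under the output directory described above; for example (a sketch,
assuming the default output location, that the event is filed under the
current run, and using a purely hypothetical superevent ID):

.. code:: bash

   # event directories are named after their GraceIDs
   # (S000000a is a hypothetical example ID)
   ls ~/.local/share/llama/current_run/S000000a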
There are many options associated with making a new
``skymap_info.json`` file (run ``skymap_info -h`` to see full
documentation of the command line interface), but by default, all you
need to specify is either the GraceID or a VOEvent file and everything
else will be inferred.

Using Defaults with ``skymap_info``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the simplest case, you can create a new event and its corresponding
``skymap_info.json`` file using an existing VOEvent with:

.. code:: bash

   skymap_info --voevent path/to/voevent.xml

or using the event's GraceID (and, implicitly, the most recently
uploaded skymap) with:

.. code:: bash

   skymap_info --graceid GRACEID

where ``GRACEID`` can be either a conventional GraceID (starting with
"G" for real events, "M" for mock events) or a superevent GraceID
(starting with "S" for real superevents, "MS" for mock superevents). In
most cases, you will want to run with a superevent GraceID, since this
is the official public data product released by LIGO/Virgo.

Whether you use ``--voevent`` or ``--graceid``, you will now be
prompted to specify whether this event is a real event or a test event.
**Please answer this question carefully, as it determines whether
results will be automatically published to collaborators.** If it is a
real event, data products will go out to collaborators (via Slack and
possibly other methods) when you run ``llama run`` to update the rest
of the contents of the directory. This is the same behavior you'd get
from the pipeline when it's automatically triggering off of real data.
Of course, if you're looking at a real event, this is a good thing, so
don't be shy about saying ``YES``; just make sure you answer
*correctly* so as not to bother or confuse people needlessly.

Sensitivity and Background Studies (O3B)
========================================

*These instructions apply starting in the second half of O3 (O3B). See
the next section for pre-O3B instructions. You might still want to read
those instructions for a more detailed overview of how the tools work:
this section is currently underdeveloped.*

Organizing Data
---------------

Make sure that all data is available on DigitalOcean Spaces and is
organized into directories by GraceID and background neutrino list
names. For mid-O3 subthreshold runs, this data is located in:

.. code::

   s3://llama/llama/artifacts/mid-o3-subthreshold-background/

Contents at run time:

.. code::

   DIR                          s3://llama/llama/artifacts/mid-o3-subthreshold-background/bklist/
   DIR                          s3://llama/llama/artifacts/mid-o3-subthreshold-background/superevents/
   2019-11-04 00:45   1088890   s3://llama/llama/artifacts/mid-o3-subthreshold-background/bklist_names.txt
   2019-10-26 20:32     30656   s3://llama/llama/artifacts/mid-o3-subthreshold-background/missing_event_ids_cwb.npy
   2019-11-04 00:30        27   s3://llama/llama/artifacts/mid-o3-subthreshold-background/placeholder.json
   2019-11-03 23:58       244   s3://llama/llama/artifacts/mid-o3-subthreshold-background/skymap_info.json.template
   2019-11-04 00:44     67832   s3://llama/llama/artifacts/mid-o3-subthreshold-background/superevents.txt

Provisioning Servers
--------------------

In principle, this analysis can be run anywhere that has an internet
connection, supports Docker (or has the LLAMA dependencies and library
installed), and can provide 8GB of memory for peak usage. These
instructions assume you are using DigitalOcean droplets provisioned
from a seed image using ``llama com do``.

Start by getting a running server with ``llama`` installed under
``~/dev/multimessenger-pipeline``.
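For example, you might clone one from an existing snapshot (a sketch:
the snapshot name ``llama-seed`` is hypothetical, so substitute
whatever seed image you are starting from; the ``-c``/``-i`` flags are
documented in the pre-O3B examples below):

.. code:: bash

   # create one droplet to set up by hand, cloned from a seed snapshot
   llama com do -c llama-provision -i llama-seed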
Make sure it doesn't need any extra ``$PATH`` configuration in order to
work. Create a snapshot of this server; let's say it's called
``mid-o3-subthreshold``, for the sake of provisioning a bunch of
servers for the parallel runs. Follow the instructions in the pre-O3B
section below to set things up, then launch with:

.. code:: bash

   llama dev dv batch

This will run ``bin/mid-o3-subthreshold-background-event-runner`` on
all servers whose names start with ``llama-``. Make sure code is
running on those servers with:

.. code:: bash

   llama dev dv ls-procs

Fetch results from S3 using ``s3cmd``:

.. code:: bash

   s3cmd get --recursive \
       s3://llama/llama/artifacts/mid-o3-subthreshold-background/mid-o3-subthreshold-background/ \
       .

Finally, collate files into a results table as described below and put
them into the background distribution file used for calculating
p-values.

Sensitivity and Background Studies (pre-O3B)
============================================

At time of writing, LLAMA uses DigitalOcean_ hosting for sensitivity
studies. A set of scripts is used for setting up a large (~100) number
of identical DigitalOcean "droplets" (virtual private server
instances), getting analyses running on them, collating the outputs,
and then spinning down those droplets.

.. _DigitalOcean: https://cloud.digitalocean.com/

Getting Started
---------------

Before you can spin up any servers, you'll need to have someone on the
GECo DigitalOcean_ account invite you to join the team (an existing
team member should visit the team management page and invite the new
user). You will need to create a DigitalOcean_ account using the email
provided (or log in if you already have one), at which point you will
be able to switch to the GECo team account by clicking the profile
button in the top right corner of the DigitalOcean_ web interface.

Generating API Tokens
---------------------

You'll now need an API token in order to be able to access DigitalOcean
using LLAMA's utility scripts (i.e. without using the web interface).
**Your access token is basically like a password, and you should treat
it with EXTREME care.** If someone gets access to your token, they'll
be able to rack up charges on the team account (e.g. by mining
cryptocurrency) and even delete our data. Ideally, once you create your
token, you should only store it on the computer that will be using it,
and you should make sure not to put copies anywhere they can be abused,
e.g. a git repository (a **very** common mistake!). You should also
delete old tokens once they are no longer being used, or if you have
the slightest suspicion that any have been compromised.

Go to the API token page on DigitalOcean and create a new token, making
sure you check the box for ``write`` access. (You'll only have one
chance to copy this token down; if you accidentally navigate away from
the page, you'll just need to delete that token and generate a new
one.) Now, you can add this token to your command line environment by
sticking the following line somewhere in your ``.bashrc`` (or
``.bash_profile`` on MacOS):

.. code:: bash

   export DIGITALOCEAN=<api-token>

where, of course, ``<api-token>`` is replaced with the API token you
just generated. Close and reopen your shell window to make sure that
this environmental variable is loaded. You can check that it's loaded
by running:

.. code:: bash

   echo $DIGITALOCEAN

which should print the API token you just created.
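If you'd rather not print the secret itself to the terminal, a minimal
alternative check (plain POSIX shell, no LLAMA-specific tooling) is:

.. code:: bash

   # report whether the variable is set without echoing the token
   [ -n "${DIGITALOCEAN}" ] && echo "DIGITALOCEAN is set" || echo "DIGITALOCEAN is NOT set"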
If the token didn't print, make sure you edited the correct file
(again, MacOS uses ``~/.bash_profile``, Linux uses ``~/.bashrc``), that
you wrote the ``export`` command above (with *no spaces around the*
``=`` *sign!*), and that you restarted your ``bash`` shell after saving
so that the startup script loaded the new variable.

Editing your .bashrc
~~~~~~~~~~~~~~~~~~~~

*Note: your* ``.bashrc`` *or* ``.bash_profile`` *(on MacOS) files are
scripts that run whenever* ``bash`` *starts up. They let you do things
like set environmental variables (what we're doing now) and other
useful things to customize your shell and make it more useful. You can
edit your* ``.bashrc`` *or* ``.bash_profile`` *with:*

.. code:: bash

   # On Linux
   nano ~/.bashrc
   # On MacOS
   nano ~/.bash_profile

*and then add the same* ``export DIGITALOCEAN=<api-token>`` *line
mentioned above.*

Using the DigitalOcean API
--------------------------

You can use the command-line utility ``llama com do`` for interacting
with the DigitalOcean API through the command line (our scripts for
running sensitivity studies rely on this script). If you have the LLAMA
pipeline installed, you can find this script in ``<llama>/bin`` (where
``<llama>`` is the location of the LLAMA software directory). Since the
scripts we are using are mostly located in ``<llama>/bin``, it's
probably easiest to clone the full LLAMA repository (if you haven't
already). If you already have this code and have ``<llama>/bin`` in
your ``$PATH`` (i.e. you can use the commands it contains at the
command line), you can skip the next subsection.

LLAMA Scripts
~~~~~~~~~~~~~

All llama scripts used for background runs are part of the ``llama``
distribution's command line tools. Use a Docker image or install the
pipeline locally to get access to them. The two main tools you'll be
using are ``llama com do`` to start up DigitalOcean servers and
interact with the infrastructure and ``llama dev dv`` to control the
servers.

Adding Your SSH Keys to the Droplets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You will need to be able to SSH into the droplets where the analysis is
being run in order to get the analyses started, check on their
statuses, and fetch output data. In order to do this, you will need to
give an SSH key (sort of like a password that is unique to your
computer) to DigitalOcean_ under the GECo account. By default, ``llama
com do`` will add all available GECo SSH keys to newly provisioned
droplets (so that all team members can access analysis servers at
will).

For starters, you'll have to make SSH keys if you haven't already. If
you put your key in the default location, you can print it to the
terminal with:

.. code:: bash

   cat ~/.ssh/id_rsa.pub

Next, make sure you're logged in as GECo on DigitalOcean_, then copy
the above key and navigate to the `DigitalOcean security page`_ to add
your SSH key. Create a new SSH key and name it something that makes it
clear who you are and which device you are using (e.g.
``Stef-Laptop``).

.. _`DigitalOcean security page`: https://cloud.digitalocean.com/account/security

Once you've saved your SSH key, you'll have automatic access to all
future droplets created through the ``llama com do`` interface.

Installing Requirements
~~~~~~~~~~~~~~~~~~~~~~~

The core CLI tools are part of the LLAMA distribution, though you will
also need to have the developer packages installed (defined in
``requirements-dev.txt`` and as Conda packages in the Dockerfiles in
``<llama>/docker/llama-dev/``). You can use the developer Docker images
to satisfy the non-python dependencies or install them on your local
system.
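If you're installing locally rather than using Docker, a minimal sketch
(run from the root of your repository clone, and assuming ``pip``
points at the environment where ``llama`` is installed):

.. code:: bash

   # install the developer Python dependencies listed in the repo
   pip install -r requirements-dev.txt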
bash 4.X (MacOS)
^^^^^^^^^^^^^^^^

The scripts and examples in this guide were written with ``bash`` 4 in
mind. If you are on any recent Linux system, you will almost certainly
be running ``bash`` 4 or later. MacOS systems, however, come with
``bash`` 3 installed by default (due to licensing issues, Apple cannot
distribute newer versions). To install the latest version of ``bash``
on MacOS:

1. Install MacPorts_ using their instructions.
2. Install ``bash`` with ``sudo port install bash``.
3. Make a copy of the old list of shells (used by the system to keep
   track of available shells) with ``cp /etc/shells ~/etc-shells``.
4. Edit the list of shell files with ``sudo nano /etc/shells`` and add
   ``/opt/local/bin/bash`` on a new line. Feel free to use a different
   text editor than ``nano``, e.g. the default graphical text editor
   with ``sudo open /etc/shells``.
5. Run ``chsh -s /opt/local/bin/bash`` to tell your computer that the
   MacPorts version of ``bash`` is the one you would like to use by
   default. You will need to enter your password to do this.
6. Without closing your current shell window, open a new window and run
   ``echo $BASH_VERSION``. You should see something starting with "4".

.. _MacPorts: https://www.macports.org

If things look weird, or if the terminal doesn't start properly, go
back to the old shell window you were using and restore the old version
of the ``/etc/shells`` file (in case you broke it) by running ``sudo cp
~/etc-shells /etc/shells`` followed by ``chsh -s /bin/bash`` to go back
to the default system ``bash``. Open a new window to confirm that
you've fixed the damage (everything should now be back to normal), then
carefully retry the above steps.

DigitalOcean Administration Examples
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Show full help documentation with:

.. code:: bash

   llama com do -h

You can list the currently running droplets with:

.. code:: bash

   llama com do -D

If you only want to list droplets whose names start with "llama", you
can add a wildcard to the end of the command (as if you were adding a
wildcard to match a bunch of files in a folder):

.. code:: bash

   llama com do -D llama*

You can also print just the IP addresses of these servers by telling
``llama com do`` to print only those columns (you can choose any
combination of columns besides this example; see the help documentation
with the ``-h`` flag for a full list):

.. code:: bash

   llama com do -D llama* -C ip

You can get rid of the header row that labels the columns (useful if
you are going to pipe those IP addresses to another script) by adding
the ``-n`` (no headers) flag:

.. code:: bash

   llama com do -n -D llama* -C ip

You can list the currently saved snapshots on your DigitalOcean_
account with:

.. code:: bash

   llama com do -S

.. _provision-single-droplet:

Create a single droplet called ``llama-provision`` from the snapshot
called ``llama-parallel-early-er14-o2-bg`` with:

.. code:: bash

   llama com do -c llama-provision -i llama-parallel-early-er14-o2-bg

You can create 98 servers numbered ``llama-parallel-00`` through
``llama-parallel-97`` (note the zero-indexing) using the
``llama-parallel-early-er14-o2-bg`` snapshot with:

.. code:: bash

   llama com do \
       -c llama-parallel-{00..97} \
       -i llama-parallel-early-er14-o2-bg

*Be sure to type "y" in the prompt to confirm that you want to create
the new droplets; note that it will take a while for them to start up.
You can monitor their progress by running* ``llama com do -D`` *as
mentioned above; droplets will be listed as* ``new`` *while they are
provisioning, and you will not be able to log into them or otherwise
use them until they have finished starting up and are listed as*
``active``.

You can delete all servers whose names start with "llama" with the
``-d`` flag; you'll be asked for final confirmation before the deletion
happens, so make sure to double check that the servers listed are the
ones you actually want to delete (**deleting droplets is NOT
reversible!**):

.. code:: bash

   llama com do -d llama*
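Since ``-n -C ip`` prints bare IP addresses, you can feed them straight
into other tools. For example, a sketch (assuming your SSH key is on
the droplets as described above; the login user is shown as ``root``
here and may differ on your images):

.. code:: bash

   # print uptime for every droplet whose name starts with "llama"
   # (wildcard quoted so the local shell doesn't expand it)
   for ip in $(llama com do -n -D 'llama*' -C ip); do
       ssh root@"$ip" uptime
   done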
Preparing a Server
------------------

When you run this analysis in parallel, you'll be doing it on a bunch
of different virtual servers. In order to keep things as simple as
possible, you want these servers to be identical in all respects
(except, possibly, for input files, which will need to be different if
you're not doing some sort of Monte Carlo sampling like we do in the
background studies). In particular, you'll want to avoid copying large
files to and from the individual servers; a 7GB list of skymaps will
eat up a whole day on a good connection if you need to copy it from
Columbia's network to 100 separate DigitalOcean_ servers.

Server Preparation Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The solution (described in detail where necessary in the links below)
is to:

1. Create a single droplet, most likely cloning it from the disk
   snapshot used for the most recent parallel run (see the :ref:`code
   example <provision-single-droplet>` above).
2. Set everything up on it by moving data files into place and deleting
   old output files; see the :ref:`section on moving files into place
   <moving-data-into-place>` below.
3. Shut it down at the command line with ``sudo shutdown now``.
4. Take a snapshot of the droplet by opening it on the DigitalOcean_
   control panel, going to the snapshot page, entering a descriptive
   name (try to make it clear when this run happened and which set of
   skymaps it was run on), and clicking the make snapshot button.
   Making the snapshot will take ~1 hour, so go do something else while
   you wait.

Once you've done these tasks, you can :ref:`spin up servers
<starting-the-servers>` and :ref:`get the analysis started
<starting-the-analysis>`. Step 2 is the most complicated and most
subject to change; the following section describes it in detail.

.. _moving-data-into-place:

Moving Data into Place
~~~~~~~~~~~~~~~~~~~~~~

Where to Get Simulated Skymaps
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*Note: this whole section will be run on some LIGO DataGrid server. You
will need to SSH into it with* ``gsissh``. At time of writing,
simulated skymaps are located on ``ldas-pcdev1.ligo-la.caltech.edu``.
Enter your LIGO credentials with:

.. code:: bash

   kinit albert.einstein@LIGO.ORG

and enter your LIGO password (obviously, replace "albert.einstein" with
your LIGO username; you'll need DataGrid credentials for this, so make
sure to set those up if you haven't already). Now create a grid proxy
from the Kerberos credentials initialized above with:

.. code:: bash

   ligo-proxy-init -k

Now you can SSH into the LIGO DataGrid server mentioned above:

.. code:: bash

   gsissh ldas-pcdev1.ligo-la.caltech.edu

(Alternatively, if you are logging in with ``ssh
albert.einstein@ssh.ligo.org``, you would select ``LLO`` and then
``pcdev1``.)

Navigate to the directory where Rainer has been keeping skymaps:

.. code:: bash

   $ cd /home/kenneth.corley/GWHEN_sample_skymaps
   $ ls
   BBH_design_sample_skymaps  BBH_O3_sample_skymaps      BNS_O3_sample_skymaps
   BBH_O2_sample_skymaps      BNS_design_sample_skymaps  README.txt

You should see a list of directories whose names describe the type of
skymaps contained. Skymaps should be in a directory called something
like ``allsky`` within one of these directories.
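To get a quick sense of how many skymaps one of these populations
contains, something like the following works (run from the directory
listed above; directory name taken from that listing):

.. code:: bash

   # count the simulated .fits skymaps under one population directory
   find BNS_O3_sample_skymaps -name '*.fits' | wc -l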
Sometimes the skymaps are split up between multiple runs. For example,
to get the O3 sensitivity above-threshold BNS skymaps, run:

.. code:: bash

   cd BNS_O3_sample_skymaps/BNS_O3_1/allsky

There should be a large number (8978) of ``.fits`` skymap files (along
with other output files generated by the skymap simulations) in this
directory. (We'll only zip up ``BNS_O3_1``, since most of the skymaps
are in this directory; we will ignore ``BNS_O3_2`` for now.)

You will need to compress these skymaps into ``.fits.gz`` files (the
format required by the LLAMA pipeline). You can't write to Rainer's
directories, so you'll need to make a directory in your own home
directory to zip these files to (let's call it
``BNS_O3_sample_skymaps`` for clarity). Run the following commands from
the ``allsky`` folder above:

.. code:: bash

   mkdir ~/BNS_O3_sample_skymaps
   # run this next loop from within the directory containing the
   # skymaps you're zipping
   find . -name '*.fits' | while read fitsfile; do
       echo "On ${fitsfile}..."
       gzip <"${fitsfile}" >~/BNS_O3_sample_skymaps/"$(basename "${fitsfile}")".gz
   done

(This will take a while to complete.)

Once you've zipped these files, you can take that output directory and
*zip it up again* into a compressed tarball (``.tar.gz`` extension).
This will put the whole directory into a single file, which makes it
easier to move around (the final file will be ~10GB in size).

.. code:: bash

   cd
   tar -czf BNS_O3_sample_skymaps.tar.gz BNS_O3_sample_skymaps

You now have a tar archive of the skymaps you want to use!
Congratulations. You'll probably want to download this directly onto
the server you are provisioning (since the connection between
DigitalOcean's servers and LIGO's data centers is probably faster than
your home connection). Read on for instructions on where to move the
files on the DigitalOcean server you are provisioning.

Copying Skymaps into a Droplet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*This section should be run on the DigitalOcean droplet you are
provisioning.*

SSH into the DigitalOcean_ server you've created for provisioning
purposes. All files for background and sensitivity studies are located
in ``~/pre-er14-background`` (for historical reasons). Navigate there
and list the contents:

.. code:: bash

   $ cd ~/pre-er14-background
   $ ls
   bklist   neutrinos                 signal-events
   events   scratchevents             signal-neutrinos
   LOCKDIR  signal-doublet-events     skymap-lists
   logs     signal-doublet-neutrinos  skymaps

These directories contain inputs and outputs for the background and
sensitivity studies. We are particularly interested in:

- ``bklist``: Lists of neutrinos used for background calculations.
- ``skymaps``: A directory containing subdirectories filled with
  skymaps (since you might want to lump together a few runs' worth of
  simulated skymaps, each in a separate directory, into a single
  background run) used by both background and sensitivity runs. For
  example, you might put a directory called ``allsky`` filled with
  ``.fits.gz`` skymaps into ``skymaps``, so that a typical input skymap
  called ``0.fits.gz`` would be located at
  ``~/pre-er14-background/skymaps/allsky/0.fits.gz``.
- ``skymap-lists``: Lists of filepaths to skymaps in ``skymaps`` that
  specify subsets of the full set of skymaps. These are used to
  distinguish between 2- and 3-detector cases; each list holds the
  skymaps that correspond to a specific set of GW detectors that
  participated in the reconstruction of those skymaps.
  When you :ref:`start the analysis <starting-the-analysis>`, you can
  specify one of these lists to only run on skymaps corresponding to
  that detector configuration. These skymap lists contain one full
  skymap path per line.

  You can make these skymap lists by downloading the ``coincs.dat``
  file from Rainer's directories on
  ``ldas-pcdev1.ligo-la.caltech.edu``. The fastest way to get this file
  and the skymaps is to download them from our DigitalOcean Spaces
  directories; simulated skymaps are located in subdirectories of
  ``https://llama.nyc3.digitaloceanspaces.com/simulations/ligo-la/~kenneth.corley``
  with descriptive names like ``BNS_O3_sample_skymaps.tar.gz`` (you
  will also need the filename hash, so the easiest way to do this is to
  list the files in that subdirectory through a command-line interface
  or the Spaces web interface). Within the unzipped tarfile, you'll be
  able to find the ``coincs.dat`` file in the specific simulation run
  you want to use; in the above example, that would be in the
  ``BNS_O3_1`` subdirectory. For reference, on the LLO LDG servers
  (``ldas-pcdev1.ligo-la.caltech.edu``), ``coincs.dat`` would be in the
  ``/home/kenneth.corley/GWHEN_sample_skymaps/BNS_O3_sample_skymaps/BNS_O3_1``
  directory.

  Once you have ``coincs.dat``, you can extract the skymap filename
  lists with something like:

  .. code:: bash

     # write one skymap-list CSV per detector combination, e.g. H1L1.csv
     for dets in H1,L1 H1,L1,V1 H1,V1 L1,V1; do
         START="$dets" python -c '
     import os
     dets = os.environ["START"]
     lines = open("participating-detectors-bns-o3.txt").readlines()
     paths = "\n".join(
         f"/home/vagrant/pre-er14-background/skymaps/{l.split()[1]}.fits.gz"
         for l in lines
         if l.split()[0] == dets
     )
     outfile = "../skymap-lists/bns-o3/" + dets.replace(",", "") + ".csv"
     open(outfile, "w").write(paths + "\n")
     '
     done

  Confirm that the *total* number of filenames matches the number of
  filenames split between the lists (the ``coincs.dat`` line count
  should be one greater than the total, because that file contains an
  initial header line):

  .. code:: bash

     $ wc -l ~/pre-er14-background/skymap-lists/bns-o3/*
       2013 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1L1.csv
       5507 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1L1V1.csv
        703 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1V1.csv
        755 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/L1V1.csv
       8978 total
     $ wc -l ~/pre-er14-background/skymaps/coincs-bns-o3.dat
       8979 coincs-bns-o3.dat

  The file contents should look something like this:

  .. code:: bash

     $ head ~/pre-er14-background/skymap-lists/bns-o3/H1L1V1.csv
     /home/vagrant/pre-er14-background/skymaps/bns-o3/1.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/2.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/3.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/4.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/5.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/6.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/8.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/16.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/17.fits.gz

- ``logs``: Contains logfiles for each run; background logfile names
  are just a timestamp (an integer) followed by ``.log`` (with the
  largest number representing the latest timestamp, and hence the
  most-recently started run).
- ``events``: Output run directory containing one directory for each
  simulated background trigger (i.e. a GW skymap with a list of
  neutrinos plus all outputs of the analysis).
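Before shutting the droplet down and snapshotting it, it's worth a
quick sanity check that the inputs above are actually in place; a
sketch, using the ``bns-o3`` names from this example:

.. code:: bash

   # spot-check the provisioned inputs before taking the snapshot
   ls ~/pre-er14-background/skymaps/bns-o3 | head
   wc -l ~/pre-er14-background/skymap-lists/bns-o3/*.csv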
.. _running-the-analysis-in-parallel:

Running the Analysis in Parallel
--------------------------------

.. _starting-the-servers:

Starting the Servers
~~~~~~~~~~~~~~~~~~~~

See how many servers you currently have running with:

.. code:: bash

   llama com do -D | wc -l

List all server names with:

.. code:: bash

   llama com do -D

You should probably create a single server to begin. You can make sure
that parallel operations work on this server before you save a
DigitalOcean snapshot of it; create a single server using the command
below and :ref:`start the analysis <starting-the-analysis>` for that
single server to see if any bugs pop up. (If you haven't run the
analysis in parallel in a while, there might be bugs; if you've made
major architectural changes since the last background run, there are
almost certainly bugs.)

.. code:: bash

   llama com do -c llama-parallel-bns-o3-00 -i llama-parallel-bns-o3

Once you've debugged things on this server and gotten the analysis
working, you can update the base image (``llama-parallel-bns-o3`` in
this example) from the working copy through the DigitalOcean_ web
interface and spin up many more servers. Let's say you're limited to
100 servers and are already using a few, so you want to spin up 94 more
instances named ``llama-parallel-bns-o3-01`` through
``llama-parallel-bns-o3-94`` (assuming you're keeping the server you
created in the previous step, you only need to create the remaining 94
droplets):

.. code:: bash

   llama com do -c llama-parallel-bns-o3-{01..94} -i llama-parallel-bns-o3

``llama dev dv`` (and any vestigial ``pre-er-14-*`` CLI utilities that
have not yet been ported to ``llama dev dv``) are automatically
vectorized, so the following commands apply equally well regardless of
whether you are running one server or 1,000. Outside of edge-case bugs,
things should run smoothly with many servers running once you've fixed
things in the single-server case.

.. _starting-the-analysis:

Running the Analysis
~~~~~~~~~~~~~~~~~~~~

If everything has been set up properly, you can start a background run.
First, add all of the existing DigitalOcean LLAMA droplets to your SSH
known hosts list with:

.. code:: bash

   llama dev dv init

Confirm that no LLAMA processes are currently running (you should see a
total of ``0`` running processes at the end of the command's output):

.. code:: bash

   llama dev dv ls-procs

Launch background runs for the H1L1 (Hanford and Livingston) detector
combination with:

.. code:: bash

   llama dev dv launch bg \
       /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1L1.csv

Check to make sure processes are running:

.. code:: bash

   llama dev dv ls-procs

See how many background run files are complete so far:

.. code:: bash

   llama dev dv done bg

Print the most recent lines of the most recent logs from each server to
see if anything interesting (read: bad) is happening (this can also be
a useful starting point for debugging):

.. code:: bash

   llama dev dv tail

You might need to execute arbitrary code on one or more remote
machines. Let's say you want to peek at the contents of the first
background injection on ``llama-bns-o3-00``. You can specify just that
droplet and list the contents of a specific directory with:

.. code:: bash

   llama dev dv -p llama-bns-o3-00 eval \
       'ls /home/vagrant/pre-er14-background/events/injection-0'

You can, of course, use this same method to delete stuff on remote
machines, move stuff, or what-have-you. You can use other ``llama dev
dv`` CLI flags to control things with more granularity.
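For instance, since ``eval`` commands are vectorized across all
droplets when you don't single one out with ``-p`` (a sketch, based on
the vectorized behavior described above), you can check free disk space
everywhere at once:

.. code:: bash

   # run a quick disk-usage check on every droplet in parallel
   llama dev dv eval 'df -h /'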
When used with the ``eval`` subcommand, ``llama dev dv`` is just a
vectorized, multi-threaded (read: insanely fast) version of ``ssh``
that's aware of your DigitalOcean naming semantics; use it to make
short work of remote commands that could otherwise be run using ``ssh
foo@your.server 'some command'``.

Now, assuming you've worked out any bugs and that things are flowing
smoothly, sit back and wait until you've collected enough events for
some level of statistical significance. Once you've amassed enough
successful background runs, kill your LLAMA processes on all droplets
with:

.. code:: bash

   llama dev dv killall

Confirm all processes are dead (after a few moments):

.. code:: bash

   llama dev dv ls-procs

Collecting Results
~~~~~~~~~~~~~~~~~~

You should run until you have at least a few tens of thousands of
simulated events. Check the progress of e.g. a background event
simulation with:

.. code:: bash

   llama dev dv done bg

Once you've simulated a sufficient number of events, you can pull the
results down to your local computer. Anticipate around 1GB per 10,000
simulated events based on the pipeline architecture at time of writing.
Pull results down and automatically organize them locally with:

.. code:: bash

   llama dev dv pull

Next, create text tables with collated results from your simulations,
**making sure to change the output filename to match the population you
are working with**, by running the following command in the directory
in which you just ran ``llama dev dv pull``:

.. code:: bash

   llama dev background table bns-o3-H1L1V1.txt

You will now have a table called ``bns-o3-H1L1V1.txt`` (or something
similar, based on the population you simulated). Look at the first 10
lines of this file and make sure they look somewhat reasonable:

.. code:: bash

   $ head bns-o3-H1L1V1.txt
   EVENT-ODDS-RATIO   N  SOURCE-DIRECTORY                      SKYMAP-FILENAME
   =============================================================================================================
   +2.2912652264e-36  9  llama-bns-o3-65/events/injection-323  /home/vagrant/pre-er14-background/skymaps/bns-o3/8937.fits.gz
   +2.673378655e-25   4  llama-bns-o3-65/events/injection-111  /home/vagrant/pre-er14-background/skymaps/bns-o3/2304.fits.gz
   +1.677508305e-15   8  llama-bns-o3-65/events/injection-129  /home/vagrant/pre-er14-background/skymaps/bns-o3/4798.fits.gz
   +9.3603728371e-09  7  llama-bns-o3-65/events/injection-116  /home/vagrant/pre-er14-background/skymaps/bns-o3/3400.fits.gz
   +9.233942133e-25   3  llama-bns-o3-65/events/injection-324  /home/vagrant/pre-er14-background/skymaps/bns-o3/1865.fits.gz
   +1.565115763e-09   9  llama-bns-o3-65/events/injection-120  /home/vagrant/pre-er14-background/skymaps/bns-o3/699.fits.gz
   +3.4337601072e-09  7  llama-bns-o3-65/events/injection-312  /home/vagrant/pre-er14-background/skymaps/bns-o3/8854.fits.gz
   +9.4312345753e-09  4  llama-bns-o3-65/events/injection-118  /home/vagrant/pre-er14-background/skymaps/bns-o3/7545.fits.gz

It should look something like the above (the whitespace might be
different; ignore that). Most critically, you should see a reasonable
distribution of non-zero neutrino trigger list lengths (the ``N``
column) and varying odds ratio values. The source directories and
skymap filenames should also be different and reasonable looking.

Assuming this all looks good, you can use these tables to generate the
``populations.hdf5`` test-statistic file using ``llama dev background
pvalue``. You can also zip up these simulation results and upload them
to our DigitalOcean Spaces storage using ``llama dev upload``.
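For example, a sketch of those last two steps (the argument to
``pvalue`` is an assumption based on the table produced above; the
exact interfaces aren't documented here, so check each command's ``-h``
output first):

.. code:: bash

   # build the populations.hdf5 test-statistic file from the collated table
   # (argument shown is an assumption; see `llama dev background pvalue -h`)
   llama dev background pvalue bns-o3-H1L1V1.txt
   # archive and upload the raw simulation results to DigitalOcean Spaces
   llama dev upload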