Introduction
============

This guide covers the operation of the pipeline, both in production and
for the purpose of running sensitivity studies (simulated background and
signal events for various populations). It also describes
troubleshooting scenarios and strategies.

How Data is Organized
=====================

Persistent data goes into ``$XDG_DATA_HOME/llama`` or, if
``$XDG_DATA_HOME`` is not defined, into ``~/.local/share/llama``. This
includes analysis outputs, archived alerts, and log files. This
directory will be called the "output directory" or ``$OUTPUT_DIR`` in
this section.

Event Directories
~~~~~~~~~~~~~~~~~

Each event gets its own directory, named after its LIGO/Virgo GraceID
(i.e. its unique event ID on GraceDB). Event directories for the
current LIGO/Virgo run are all held in ``$OUTPUT_DIR/current_run``.
Event directories for old events/manual runs, if they are saved, live
in separate run directories, each of which should be located in
``$OUTPUT_DIR/past_runs``; for example, GW170817, the world's first BNS
merger detection, can be found on the current LLAMA analysis server in
``~/.local/share/llama/past_runs/events-o2/G298048``.

GCN Notice Archive
~~~~~~~~~~~~~~~~~~

All GCN notices heard by the server while ``gcnd`` is running will, by
default, be archived to ``$OUTPUT_DIR/gcn`` (this includes
non-LIGO/Virgo triggers). If you wish to view these alerts, you can
look in this directory, which contains subdirectories of the form
``YEAR-MONTH/DAY`` (to avoid having too many files in a single
directory).

Log Files
~~~~~~~~~

Logs are stored in ``$OUTPUT_DIR/logs``. While the formats and contents
of different log files differ based on which program created them, they
are generally named after the script that created them. The LLAMA
update daemon, ``llama run``, writes to logfiles with ``llamad`` in
their names; the GCN listener, ``llama listen gcn``, writes to
``gcn.log``. It's often useful to know which log file was written to
most recently (e.g. to find out whether the script you ran is
faithfully logging output as expected); you can do this with ``ls
-ltr``.

Running the Pipeline Automatically
==================================

*This section will be written later.*

Running the Pipeline Manually
=============================

Several crucial files can be generated manually (that is to say,
without relying on automated trigger acquisition and pipeline
execution). This section describes common usage scenarios.

Creating a New Event with the skymap_info CLI Script
----------------------------------------------------

``skymap_info`` is a command line script for creating new event
directories along with a new ``skymap_info.json`` file describing the
gravitational wave event and specifying the GW skymap that is to be
used. Creating this file is the first step in any GW multimessenger
analysis in LLAMA (see the documentation for
``llama.files.skymap_info.SkymapInfo`` for more information).
Ordinarily this file is generated by one of the listener scripts (e.g.
the ones listening for new LVAlerts from LIGO/Virgo's GraceDB or the
GCN Notice listener). The ``skymap_info`` script can be used to
generate this file manually, however, using either the event's GraceID
(its unique ID on GraceDB) or a VOEvent XML file (as generated on
GraceDB and distributed as a GCN Notice).
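The resulting event directory is named after the GraceID and lives
under the output directory described above; for example (a sketch,
assuming the default output location, that the event is filed under the
current run, and using a purely hypothetical superevent ID):

.. code:: bash

   # event directories are named after their GraceIDs
   # (S000000a is a hypothetical example ID)
   ls ~/.local/share/llama/current_run/S000000a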
There are many options associated with making a new
``skymap_info.json`` file (run ``skymap_info -h`` to see full
documentation of the command line interface), but by default, all you
need to specify is either the GraceID or a VOEvent file and everything
else will be inferred.

Using Defaults with ``skymap_info``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the simplest case, you can create a new event and its corresponding
``skymap_info.json`` file using an existing VOEvent with:

.. code:: bash

   skymap_info --voevent path/to/voevent.xml

or using the event's GraceID (and, implicitly, the most recently
uploaded skymap) with:

.. code:: bash

   skymap_info --graceid GRACEID

where ``GRACEID`` can be either a conventional GraceID (starting with
"G" for real events, "M" for mock events) or a superevent GraceID
(starting with "S" for real superevents, "MS" for mock superevents). In
most cases, you will want to run with a superevent GraceID, since this
is the official public data product released by LIGO/Virgo.

Whether you use ``--voevent`` or ``--graceid``, you will now be
prompted to specify whether this event is a real event or a test event.
**Please answer this question carefully, as it determines whether
results will be automatically published to collaborators.** If it is a
real event, data products will go out to collaborators (via Slack and
possibly other methods) when you run ``llama run`` to update the rest
of the contents of the directory. This is the same behavior you'd get
from the pipeline when it's automatically triggering off of real data.
Of course, if you're looking at a real event, this is a good thing, so
don't be shy about saying ``YES``; just make sure you answer
*correctly* so as not to bother or confuse people needlessly.

Sensitivity and Background Studies (O3B)
========================================

*These instructions apply starting in the second half of O3 (O3B). See
the next section for pre-O3B instructions. You might still want to read
those instructions for a more detailed overview of how the tools work:
this section is currently underdeveloped.*

Organizing Data
---------------

Make sure that all data is available on DigitalOcean Spaces and is
organized into directories by GraceID and background neutrino list
names. For mid-O3 subthreshold runs, this data is located in:

.. code::

   s3://llama/llama/artifacts/mid-o3-subthreshold-background/

Contents at run time:

.. code::

   DIR                          s3://llama/llama/artifacts/mid-o3-subthreshold-background/bklist/
   DIR                          s3://llama/llama/artifacts/mid-o3-subthreshold-background/superevents/
   2019-11-04 00:45   1088890   s3://llama/llama/artifacts/mid-o3-subthreshold-background/bklist_names.txt
   2019-10-26 20:32     30656   s3://llama/llama/artifacts/mid-o3-subthreshold-background/missing_event_ids_cwb.npy
   2019-11-04 00:30        27   s3://llama/llama/artifacts/mid-o3-subthreshold-background/placeholder.json
   2019-11-03 23:58       244   s3://llama/llama/artifacts/mid-o3-subthreshold-background/skymap_info.json.template
   2019-11-04 00:44     67832   s3://llama/llama/artifacts/mid-o3-subthreshold-background/superevents.txt

Provisioning Servers
--------------------

In principle, this analysis can be run anywhere that has an internet
connection, supports Docker (or has the LLAMA dependencies and library
installed), and can provide 8GB of memory for peak usage. These
instructions assume you are using DigitalOcean droplets provisioned
from a seed image using ``llama com do``.

Start by getting a running server with ``llama`` installed under
``~/dev/multimessenger-pipeline``.
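For example, you might clone one from an existing snapshot (a sketch:
the snapshot name ``llama-seed`` is hypothetical, so substitute
whatever seed image you are starting from; the ``-c``/``-i`` flags are
documented in the pre-O3B examples below):

.. code:: bash

   # create one droplet to set up by hand, cloned from a seed snapshot
   llama com do -c llama-provision -i llama-seed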
Make sure it doesn't need any extra ``$PATH`` configuration in order to
work. Create a snapshot of this server; let's say it's called
``mid-o3-subthreshold``, for the sake of provisioning a bunch of
servers for the parallel runs. Follow the instructions in the pre-O3B
section below to set things up, then launch with:

.. code:: bash

   llama dev dv batch

This will run ``bin/mid-o3-subthreshold-background-event-runner`` on
all servers whose names start with ``llama-``. Make sure code is
running on those servers with:

.. code:: bash

   llama dev dv ls-procs

Fetch results from S3 using ``s3cmd``:

.. code:: bash

   s3cmd get --recursive \
       s3://llama/llama/artifacts/mid-o3-subthreshold-background/mid-o3-subthreshold-background/ \
       .

Finally, collate files into a results table as described below and put
them into the background distribution file used for calculating
p-values.

Sensitivity and Background Studies (pre-O3B)
============================================

At time of writing, LLAMA uses DigitalOcean_ hosting for sensitivity
studies. A set of scripts is used for setting up a large (~100) number
of identical DigitalOcean "droplets" (virtual private server
instances), getting analyses running on them, collating the outputs,
and then spinning down those droplets.

.. _DigitalOcean: https://cloud.digitalocean.com/

Getting Started
---------------

Before you can spin up any servers, you'll need to have someone on the
GECo DigitalOcean_ account invite you to join the team (an existing
team member should visit the team management page and invite the new
user). You will need to create a DigitalOcean_ account using the email
provided (or log in if you already have one), at which point you will
be able to switch to the GECo team account by clicking the profile
button in the top right corner of the DigitalOcean_ web interface.

Generating API Tokens
---------------------

You'll now need an API token in order to be able to access DigitalOcean
using LLAMA's utility scripts (i.e. without using the web interface).
**Your access token is basically like a password, and you should treat
it with EXTREME care.** If someone gets access to your token, they'll
be able to rack up charges on the team account (e.g. by mining
cryptocurrency) and even delete our data. Ideally, once you create your
token, you should only store it on the computer that will be using it,
and you should make sure not to put copies anywhere they can be abused,
e.g. a git repository (a **very** common mistake!). You should also
delete old tokens once they are no longer being used, or if you have
the slightest suspicion that any have been compromised.

Go to the API token page on DigitalOcean and create a new token, making
sure you check the box for ``write`` access. (You'll only have one
chance to copy this token down; if you accidentally navigate away from
the page, you'll just need to delete that token and generate a new
one.) Now, you can add this token to your command line environment by
sticking the following line somewhere in your ``.bashrc`` (or
``.bash_profile`` on MacOS):

.. code:: bash

   export DIGITALOCEAN=<api-token>

where, of course, ``<api-token>`` is replaced with the API token you
just generated. Close and reopen your shell window to make sure that
this environmental variable is loaded. You can check that it's loaded
by running:

.. code:: bash

   echo $DIGITALOCEAN

which should print the API token you just created.
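If you'd rather not print the secret itself to the terminal, a minimal
alternative check (plain POSIX shell, no LLAMA-specific tooling) is:

.. code:: bash

   # report whether the variable is set without echoing the token
   [ -n "${DIGITALOCEAN}" ] && echo "DIGITALOCEAN is set" || echo "DIGITALOCEAN is NOT set"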
If the token didn't print, make sure you edited the correct file
(again, MacOS uses ``~/.bash_profile``, Linux uses ``~/.bashrc``), that
you wrote the ``export`` command above (with *no spaces around the*
``=`` *sign!*), and that you restarted your ``bash`` shell after saving
so that the startup script loaded the new variable.

Editing your .bashrc
~~~~~~~~~~~~~~~~~~~~

*Note: your* ``.bashrc`` *or* ``.bash_profile`` *(on MacOS) files are
scripts that run whenever* ``bash`` *starts up. They let you do things
like set environmental variables (what we're doing now) and other
useful things to customize your shell and make it more useful. You can
edit your* ``.bashrc`` *or* ``.bash_profile`` *with:*

.. code:: bash

   # On Linux
   nano ~/.bashrc
   # On MacOS
   nano ~/.bash_profile

*and then add the same* ``export DIGITALOCEAN=<api-token>`` *line
mentioned above.*

Using the DigitalOcean API
--------------------------

You can use the command-line utility ``llama com do`` for interacting
with the DigitalOcean API through the command line (our scripts for
running sensitivity studies rely on this script). If you have the LLAMA
pipeline installed, you can find this script in ``<llama>/bin`` (where
``<llama>`` is the location of the LLAMA software directory). Since the
scripts we are using are mostly located in ``<llama>/bin``, it's
probably easiest to clone the full LLAMA repository (if you haven't
already). If you already have this code and have ``<llama>/bin`` in
your ``$PATH`` (i.e. you can use the commands it contains at the
command line), you can skip the next subsection.

LLAMA Scripts
~~~~~~~~~~~~~

All llama scripts used for background runs are part of the ``llama``
distribution's command line tools. Use a Docker image or install the
pipeline locally to get access to them. The two main tools you'll be
using are ``llama com do`` to start up DigitalOcean servers and
interact with the infrastructure and ``llama dev dv`` to control the
servers.

Adding Your SSH Keys to the Droplets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You will need to be able to SSH into the droplets where the analysis is
being run in order to get the analyses started, check on their
statuses, and fetch output data. In order to do this, you will need to
give an SSH key (sort of like a password that is unique to your
computer) to DigitalOcean_ under the GECo account. By default, ``llama
com do`` will add all available GECo SSH keys to newly provisioned
droplets (so that all team members can access analysis servers at
will).

For starters, you'll have to make SSH keys if you haven't already. If
you put your key in the default location, you can print it to the
terminal with:

.. code:: bash

   cat ~/.ssh/id_rsa.pub

Next, make sure you're logged in as GECo on DigitalOcean_, then copy
the above key and navigate to the `DigitalOcean security page`_ to add
your SSH key. Create a new SSH key and name it something that makes it
clear who you are and which device you are using (e.g.
``Stef-Laptop``).

.. _`DigitalOcean security page`: https://cloud.digitalocean.com/account/security

Once you've saved your SSH key, you'll have automatic access to all
future droplets created through the ``llama com do`` interface.

Installing Requirements
~~~~~~~~~~~~~~~~~~~~~~~

The core CLI tools are part of the LLAMA distribution, though you will
also need to have the developer packages installed (defined in
``requirements-dev.txt`` and as Conda packages in the Dockerfiles in
``<llama>/docker/llama-dev/``). You can use the developer Docker images
to satisfy the non-python dependencies or install them on your local
system.
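If you're installing locally rather than using Docker, a minimal sketch
(run from the root of your repository clone, and assuming ``pip``
points at the environment where ``llama`` is installed):

.. code:: bash

   # install the developer Python dependencies listed in the repo
   pip install -r requirements-dev.txt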
bash 4.X (MacOS)
^^^^^^^^^^^^^^^^

The scripts and examples in this guide were written with ``bash`` 4 in
mind. If you are on any recent Linux system, you will almost certainly
be running ``bash`` 4 or later. MacOS systems, however, come with
``bash`` 3 installed by default (due to licensing issues, Apple cannot
distribute newer versions). To install the latest version of ``bash``
on MacOS:

1. Install MacPorts_ using their instructions.
2. Install ``bash`` with ``sudo port install bash``.
3. Make a copy of the old list of shells (used by the system to keep
   track of available shells) with ``cp /etc/shells ~/etc-shells``.
4. Edit the list of shell files with ``sudo nano /etc/shells`` and add
   ``/opt/local/bin/bash`` on a new line. Feel free to use a different
   text editor than ``nano``, e.g. the default graphical text editor
   with ``sudo open /etc/shells``.
5. Run ``chsh -s /opt/local/bin/bash`` to tell your computer that the
   MacPorts version of ``bash`` is the one you would like to use by
   default. You will need to enter your password to do this.
6. Without closing your current shell window, open a new window and run
   ``echo $BASH_VERSION``. You should see something starting with "4".

.. _MacPorts: https://www.macports.org

If things look weird, or if the terminal doesn't start properly, go
back to the old shell window you were using and restore the old version
of the ``/etc/shells`` file (in case you broke it) by running ``sudo cp
~/etc-shells /etc/shells`` followed by ``chsh -s /bin/bash`` to go back
to the default system ``bash``. Open a new window to confirm that
you've fixed the damage (everything should now be back to normal), then
carefully retry the above steps.

DigitalOcean Administration Examples
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Show full help documentation with:

.. code:: bash

   llama com do -h

You can list the currently running droplets with:

.. code:: bash

   llama com do -D

If you only want to list droplets whose names start with "llama", you
can add a wildcard to the end of the command (as if you were adding a
wildcard to match a bunch of files in a folder):

.. code:: bash

   llama com do -D llama*

You can also print just the IP addresses of these servers by telling
``llama com do`` to print only those columns (you can choose any
combination of columns besides this example; see the help documentation
with the ``-h`` flag for a full list):

.. code:: bash

   llama com do -D llama* -C ip

You can get rid of the header row that labels the columns (useful if
you are going to pipe those IP addresses to another script) by adding
the ``-n`` (no headers) flag:

.. code:: bash

   llama com do -n -D llama* -C ip

You can list the currently saved snapshots on your DigitalOcean_
account with:

.. code:: bash

   llama com do -S

.. _provision-single-droplet:

Create a single droplet called ``llama-provision`` from the snapshot
called ``llama-parallel-early-er14-o2-bg`` with:

.. code:: bash

   llama com do -c llama-provision -i llama-parallel-early-er14-o2-bg

You can create 98 servers numbered ``llama-parallel-00`` through
``llama-parallel-97`` (note the zero-indexing) using the
``llama-parallel-early-er14-o2-bg`` snapshot with:

.. code:: bash

   llama com do \
       -c llama-parallel-{00..97} \
       -i llama-parallel-early-er14-o2-bg

*Be sure to type "y" in the prompt to confirm that you want to create
the new droplets; note that it will take a while for them to start up.
You can monitor their progress by running* ``llama com do -D`` *as
mentioned above; droplets will be listed as* ``new`` *while they are
provisioning, and you will not be able to log into them or otherwise
use them until they have finished starting up and are listed as*
``active``.

You can delete all servers whose names start with "llama" with the
``-d`` flag; you'll be asked for final confirmation before the deletion
happens, so make sure to double check that the servers listed are the
ones you actually want to delete (**deleting droplets is NOT
reversible!**):

.. code:: bash

   llama com do -d llama*
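Since ``-n -C ip`` prints bare IP addresses, you can feed them straight
into other tools. For example, a sketch (assuming your SSH key is on
the droplets as described above; the login user is shown as ``root``
here and may differ on your images):

.. code:: bash

   # print uptime for every droplet whose name starts with "llama"
   # (wildcard quoted so the local shell doesn't expand it)
   for ip in $(llama com do -n -D 'llama*' -C ip); do
       ssh root@"$ip" uptime
   done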
Preparing a Server
------------------

When you run this analysis in parallel, you'll be doing it on a bunch
of different virtual servers. In order to keep things as simple as
possible, you want these servers to be identical in all respects
(except, possibly, for input files, which will need to be different if
you're not doing some sort of Monte Carlo sampling like we do in the
background studies). In particular, you'll want to avoid copying large
files to and from the individual servers; a 7GB list of skymaps will
eat up a whole day on a good connection if you need to copy it from
Columbia's network to 100 separate DigitalOcean_ servers.

Server Preparation Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The solution (described in detail where necessary in the links below)
is to:

1. Create a single droplet, most likely cloning it from the disk
   snapshot used for the most recent parallel run (see the :ref:`code
   example <provision-single-droplet>` above).
2. Set everything up on it by moving data files into place and deleting
   old output files; see the :ref:`section on moving files into place
   <moving-data-into-place>` below.
3. Shut it down at the command line with ``sudo shutdown now``.
4. Take a snapshot of the droplet by opening it on the DigitalOcean_
   control panel, going to the snapshot page, entering a descriptive
   name (try to make it clear when this run happened and which set of
   skymaps it was run on), and clicking the make snapshot button.
   Making the snapshot will take ~1 hour, so go do something else while
   you wait.

Once you've done these tasks, you can :ref:`spin up servers
<starting-the-servers>` and :ref:`get the analysis started
<starting-the-analysis>`. Step 2 is the most complicated and most
subject to change; the following section describes it in detail.

.. _moving-data-into-place:

Moving Data into Place
~~~~~~~~~~~~~~~~~~~~~~

Where to Get Simulated Skymaps
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*Note: this whole section will be run on some LIGO DataGrid server. You
will need to SSH into it with* ``gsissh``. At time of writing,
simulated skymaps are located on ``ldas-pcdev1.ligo-la.caltech.edu``.
Enter your LIGO credentials with:

.. code:: bash

   kinit albert.einstein@LIGO.ORG

and enter your LIGO password (obviously, replace "albert.einstein" with
your LIGO username; you'll need DataGrid credentials for this, so make
sure to set those up if you haven't already). Now create a grid proxy
from the Kerberos credentials initialized above with:

.. code:: bash

   ligo-proxy-init -k

Now you can SSH into the LIGO DataGrid server mentioned above:

.. code:: bash

   gsissh ldas-pcdev1.ligo-la.caltech.edu

(Alternatively, if you are logging in with ``ssh
albert.einstein@ssh.ligo.org``, you would select ``LLO`` and then
``pcdev1``.)

Navigate to the directory where Rainer has been keeping skymaps:

.. code:: bash

   $ cd /home/kenneth.corley/GWHEN_sample_skymaps
   $ ls
   BBH_design_sample_skymaps  BBH_O3_sample_skymaps      BNS_O3_sample_skymaps
   BBH_O2_sample_skymaps      BNS_design_sample_skymaps  README.txt

You should see a list of directories whose names describe the type of
skymaps contained. Skymaps should be in a directory called something
like ``allsky`` within one of these directories.
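To get a quick sense of how many skymaps one of these populations
contains, something like the following works (run from the directory
listed above; directory name taken from that listing):

.. code:: bash

   # count the simulated .fits skymaps under one population directory
   find BNS_O3_sample_skymaps -name '*.fits' | wc -l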
Sometimes the skymaps are split up between multiple runs. For example,
to get the O3 sensitivity above-threshold BNS skymaps, run:

.. code:: bash

   cd BNS_O3_sample_skymaps/BNS_O3_1/allsky

There should be a large number (8978) of ``.fits`` skymap files (along
with other output files generated by the skymap simulations) in this
directory. (We'll only zip up ``BNS_O3_1``, since most of the skymaps
are in this directory; we will ignore ``BNS_O3_2`` for now.)

You will need to compress these skymaps into ``.fits.gz`` files (the
format required by the LLAMA pipeline). You can't write to Rainer's
directories, so you'll need to make a directory in your own home
directory to zip these files to (let's call it
``BNS_O3_sample_skymaps`` for clarity). Run the following commands from
the ``allsky`` folder above:

.. code:: bash

   mkdir ~/BNS_O3_sample_skymaps
   # run this next loop from within the directory containing the
   # skymaps you're zipping
   find . -name '*.fits' | while read fitsfile; do
       echo "On ${fitsfile}..."
       gzip <"${fitsfile}" >~/BNS_O3_sample_skymaps/"$(basename "${fitsfile}")".gz
   done

(This will take a while to complete.)

Once you've zipped these files, you can take that output directory and
*zip it up again* into a compressed tarball (``.tar.gz`` extension).
This will put the whole directory into a single file, which makes it
easier to move around (the final file will be ~10GB in size).

.. code:: bash

   cd
   tar -czf BNS_O3_sample_skymaps.tar.gz BNS_O3_sample_skymaps

You now have a tar archive of the skymaps you want to use!
Congratulations. You'll probably want to download this directly onto
the server you are provisioning (since the connection between
DigitalOcean's servers and LIGO's data centers is probably faster than
your home connection). Read on for instructions on where to move the
files on the DigitalOcean server you are provisioning.

Copying Skymaps into a Droplet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*This section should be run on the DigitalOcean droplet you are
provisioning.*

SSH into the DigitalOcean_ server you've created for provisioning
purposes. All files for background and sensitivity studies are located
in ``~/pre-er14-background`` (for historical reasons). Navigate there
and list the contents:

.. code:: bash

   $ cd ~/pre-er14-background
   $ ls
   bklist   neutrinos                 signal-events
   events   scratchevents             signal-neutrinos
   LOCKDIR  signal-doublet-events     skymap-lists
   logs     signal-doublet-neutrinos  skymaps

These directories contain inputs and outputs for the background and
sensitivity studies. We are particularly interested in:

- ``bklist``: Lists of neutrinos used for background calculations.
- ``skymaps``: A directory containing subdirectories filled with
  skymaps (since you might want to lump together a few runs' worth of
  simulated skymaps, each in a separate directory, into a single
  background run) used by both background and sensitivity runs. For
  example, you might put a directory called ``allsky`` filled with
  ``.fits.gz`` skymaps into ``skymaps``, so that a typical input skymap
  called ``0.fits.gz`` would be located at
  ``~/pre-er14-background/skymaps/allsky/0.fits.gz``.
- ``skymap-lists``: Lists of filepaths to skymaps in ``skymaps`` that
  specify subsets of the full set of skymaps. These are used to
  distinguish between 2- and 3-detector cases; each list holds the
  skymaps that correspond to a specific set of GW detectors that
  participated in the reconstruction of those skymaps.
  When you :ref:`start the analysis <starting-the-analysis>`, you can
  specify one of these lists to only run on skymaps corresponding to
  that detector configuration. These skymap lists contain one full
  skymap path per line.

  You can make these skymap lists by downloading the ``coincs.dat``
  file from Rainer's directories on
  ``ldas-pcdev1.ligo-la.caltech.edu``. The fastest way to get this file
  and the skymaps is to download them from our DigitalOcean Spaces
  directories; simulated skymaps are located in subdirectories of
  ``https://llama.nyc3.digitaloceanspaces.com/simulations/ligo-la/~kenneth.corley``
  with descriptive names like ``BNS_O3_sample_skymaps.tar.gz`` (you
  will also need the filename hash, so the easiest way to do this is to
  list the files in that subdirectory through a command-line interface
  or the Spaces web interface). Within the unzipped tarfile, you'll be
  able to find the ``coincs.dat`` file in the specific simulation run
  you want to use; in the above example, that would be in the
  ``BNS_O3_1`` subdirectory. For reference, on the LLO LDG servers
  (``ldas-pcdev1.ligo-la.caltech.edu``), ``coincs.dat`` would be in the
  ``/home/kenneth.corley/GWHEN_sample_skymaps/BNS_O3_sample_skymaps/BNS_O3_1``
  directory.

  Once you have ``coincs.dat``, you can extract the skymap filename
  lists with something like:

  .. code:: bash

     # write one skymap-list CSV per detector combination, e.g. H1L1.csv
     for dets in H1,L1 H1,L1,V1 H1,V1 L1,V1; do
         START="$dets" python -c '
     import os
     dets = os.environ["START"]
     lines = open("participating-detectors-bns-o3.txt").readlines()
     paths = "\n".join(
         f"/home/vagrant/pre-er14-background/skymaps/{l.split()[1]}.fits.gz"
         for l in lines
         if l.split()[0] == dets
     )
     outfile = "../skymap-lists/bns-o3/" + dets.replace(",", "") + ".csv"
     open(outfile, "w").write(paths + "\n")
     '
     done

  Confirm that the *total* number of filenames matches the number of
  filenames split between the lists (the ``coincs.dat`` line count
  should be one greater than the total, because that file contains an
  initial header line):

  .. code:: bash

     $ wc -l ~/pre-er14-background/skymap-lists/bns-o3/*
       2013 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1L1.csv
       5507 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1L1V1.csv
        703 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1V1.csv
        755 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/L1V1.csv
       8978 total
     $ wc -l ~/pre-er14-background/skymaps/coincs-bns-o3.dat
       8979 coincs-bns-o3.dat

  The file contents should look something like this:

  .. code:: bash

     $ head ~/pre-er14-background/skymap-lists/bns-o3/H1L1V1.csv
     /home/vagrant/pre-er14-background/skymaps/bns-o3/1.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/2.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/3.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/4.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/5.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/6.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/8.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/16.fits.gz
     /home/vagrant/pre-er14-background/skymaps/bns-o3/17.fits.gz

- ``logs``: Contains logfiles for each run; background logfile names
  are just a timestamp (an integer) followed by ``.log`` (with the
  largest number representing the latest timestamp, and hence the
  most-recently started run).
- ``events``: Output run directory containing one directory for each
  simulated background trigger (i.e. a GW skymap with a list of
  neutrinos plus all outputs of the analysis).
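Before shutting the droplet down and snapshotting it, it's worth a
quick sanity check that the inputs above are actually in place; a
sketch, using the ``bns-o3`` names from this example:

.. code:: bash

   # spot-check the provisioned inputs before taking the snapshot
   ls ~/pre-er14-background/skymaps/bns-o3 | head
   wc -l ~/pre-er14-background/skymap-lists/bns-o3/*.csv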
.. _running-the-analysis-in-parallel:

Running the Analysis in Parallel
--------------------------------

.. _starting-the-servers:

Starting the Servers
~~~~~~~~~~~~~~~~~~~~

See how many servers you currently have running with:

.. code:: bash

   llama com do -D | wc -l

List all server names with:

.. code:: bash

   llama com do -D

You should probably create a single server to begin. You can make sure
that parallel operations work on this server before you save a
DigitalOcean snapshot of it; create a single server using the command
below and :ref:`start the analysis <starting-the-analysis>` for that
single server to see if any bugs pop up. (If you haven't run the
analysis in parallel in a while, there might be bugs; if you've made
major architectural changes since the last background run, there are
almost certainly bugs.)

.. code:: bash

   llama com do -c llama-parallel-bns-o3-00 -i llama-parallel-bns-o3

Once you've debugged things on this server and gotten the analysis
working, you can update the base image (``llama-parallel-bns-o3`` in
this example) from the working copy through the DigitalOcean_ web
interface and spin up many more servers. Let's say you're limited to
100 servers and are already using a few, so you want to spin up 94 more
instances named ``llama-parallel-bns-o3-01`` through
``llama-parallel-bns-o3-94`` (assuming you're keeping the server you
created in the previous step, you only need to create the remaining 94
droplets):

.. code:: bash

   llama com do -c llama-parallel-bns-o3-{01..94} -i llama-parallel-bns-o3

``llama dev dv`` (and any vestigial ``pre-er-14-*`` CLI utilities that
have not yet been ported to ``llama dev dv``) are automatically
vectorized, so the following commands apply equally well regardless of
whether you are running one server or 1,000. Outside of edge-case bugs,
things should run smoothly with many servers running once you've fixed
things in the single-server case.

.. _starting-the-analysis:

Running the Analysis
~~~~~~~~~~~~~~~~~~~~

If everything has been set up properly, you can start a background run.
First, add all of the existing DigitalOcean LLAMA droplets to your SSH
known hosts list with:

.. code:: bash

   llama dev dv init

Confirm that no LLAMA processes are currently running (you should see a
total of ``0`` running processes at the end of the command's output):

.. code:: bash

   llama dev dv ls-procs

Launch background runs for the H1L1 (Hanford and Livingston) detector
combination with:

.. code:: bash

   llama dev dv launch bg \
       /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1L1.csv

Check to make sure processes are running:

.. code:: bash

   llama dev dv ls-procs

See how many background run files are complete so far:

.. code:: bash

   llama dev dv done bg

Print the most recent lines of the most recent logs from each server to
see if anything interesting (read: bad) is happening (this can also be
a useful starting point for debugging):

.. code:: bash

   llama dev dv tail

You might need to execute arbitrary code on one or more remote
machines. Let's say you want to peek at the contents of the first
background injection on ``llama-bns-o3-00``. You can specify just that
droplet and list the contents of a specific directory with:

.. code:: bash

   llama dev dv -p llama-bns-o3-00 eval \
       'ls /home/vagrant/pre-er14-background/events/injection-0'

You can, of course, use this same method to delete stuff on remote
machines, move stuff, or what-have-you. You can use other ``llama dev
dv`` CLI flags to control things with more granularity.
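For instance, since ``eval`` commands are vectorized across all
droplets when you don't single one out with ``-p`` (a sketch, based on
the vectorized behavior described above), you can check free disk space
everywhere at once:

.. code:: bash

   # run a quick disk-usage check on every droplet in parallel
   llama dev dv eval 'df -h /'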
When used with the ``eval`` subcommand, ``llama dev dv`` is just a
vectorized, multi-threaded (read: insanely fast) version of ``ssh``
that's aware of your DigitalOcean naming semantics; use it to make
short work of remote commands that could otherwise be run using ``ssh
foo@your.server 'some command'``.

Now, assuming you've worked out any bugs and that things are flowing
smoothly, sit back and wait until you've collected enough events for
some level of statistical significance. Once you've amassed enough
successful background runs, kill your LLAMA processes on all droplets
with:

.. code:: bash

   llama dev dv killall

Confirm all processes are dead (after a few moments):

.. code:: bash

   llama dev dv ls-procs

Collecting Results
~~~~~~~~~~~~~~~~~~

You should run until you have at least a few tens of thousands of
simulated events. Check the progress of e.g. a background event
simulation with:

.. code:: bash

   llama dev dv done bg

Once you've simulated a sufficient number of events, you can pull the
results down to your local computer. Anticipate around 1GB per 10,000
simulated events based on the pipeline architecture at time of writing.
Pull results down and automatically organize them locally with:

.. code:: bash

   llama dev dv pull

Next, create text tables with collated results from your simulations,
**making sure to change the output filename to match the population you
are working with**, by running the following command in the directory
in which you just ran ``llama dev dv pull``:

.. code:: bash

   llama dev background table bns-o3-H1L1V1.txt

You will now have a table called ``bns-o3-H1L1V1.txt`` (or something
similar, based on the population you simulated). Look at the first 10
lines of this file and make sure they look somewhat reasonable:

.. code:: bash

   $ head bns-o3-H1L1V1.txt
   EVENT-ODDS-RATIO   N  SOURCE-DIRECTORY                      SKYMAP-FILENAME
   =============================================================================================================
   +2.2912652264e-36  9  llama-bns-o3-65/events/injection-323  /home/vagrant/pre-er14-background/skymaps/bns-o3/8937.fits.gz
   +2.673378655e-25   4  llama-bns-o3-65/events/injection-111  /home/vagrant/pre-er14-background/skymaps/bns-o3/2304.fits.gz
   +1.677508305e-15   8  llama-bns-o3-65/events/injection-129  /home/vagrant/pre-er14-background/skymaps/bns-o3/4798.fits.gz
   +9.3603728371e-09  7  llama-bns-o3-65/events/injection-116  /home/vagrant/pre-er14-background/skymaps/bns-o3/3400.fits.gz
   +9.233942133e-25   3  llama-bns-o3-65/events/injection-324  /home/vagrant/pre-er14-background/skymaps/bns-o3/1865.fits.gz
   +1.565115763e-09   9  llama-bns-o3-65/events/injection-120  /home/vagrant/pre-er14-background/skymaps/bns-o3/699.fits.gz
   +3.4337601072e-09  7  llama-bns-o3-65/events/injection-312  /home/vagrant/pre-er14-background/skymaps/bns-o3/8854.fits.gz
   +9.4312345753e-09  4  llama-bns-o3-65/events/injection-118  /home/vagrant/pre-er14-background/skymaps/bns-o3/7545.fits.gz

It should look something like the above (the whitespace might be
different; ignore that). Most critically, you should see a reasonable
distribution of non-zero neutrino trigger list lengths (the ``N``
column) and varying odds ratio values. The source directories and
skymap filenames should also be different and reasonable looking.

Assuming this all looks good, you can use these tables to generate the
``populations.hdf5`` test-statistic file using ``llama dev background
pvalue``. You can also zip up these simulation results and upload them
to our DigitalOcean Spaces storage using ``llama dev upload``.
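For example, a sketch of those last two steps (the argument to
``pvalue`` is an assumption based on the table produced above; the
exact interfaces aren't documented here, so check each command's ``-h``
output first):

.. code:: bash

   # build the populations.hdf5 test-statistic file from the collated table
   # (argument shown is an assumption; see `llama dev background pvalue -h`)
   llama dev background pvalue bns-o3-H1L1V1.txt
   # archive and upload the raw simulation results to DigitalOcean Spaces
   llama dev upload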