Introduction¶
This guide covers the operation of the pipeline, both in production and for the purpose of running sensitivity studies (simulated background and signal events for various populations). It also describes troubleshooting scenarios and strategies.
How Data is Organized¶
Persistent data goes into $XDG_DATA_HOME/llama or, if $XDG_DATA_HOME is not defined, into ~/.local/share/llama. This includes analysis outputs, archived alerts, and log files. This directory will be called the “output directory” or $OUTPUT_DIR in this section.
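If you need the output directory in your own shell commands, you can resolve it the same way using standard shell parameter expansion. This is a minimal sketch; the OUTPUT_DIR variable name is just the convention used in this section (the pipeline does not export it for you):
# fall back to ~/.local/share when $XDG_DATA_HOME is unset
OUTPUT_DIR="${XDG_DATA_HOME:-$HOME/.local/share}/llama"
ls "$OUTPUT_DIR"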
Event Directories¶
Each event gets its own directory, named after its LIGO/Virgo GraceID (i.e. its unique event ID on GraceDB). Event directories for the current LIGO/Virgo run are all held in $OUTPUT_DIR/current_run. Event directories for old events/manual runs, if they are saved, will be in different run directories, each of which should be located in $OUTPUT_DIR/past_runs, so that, for example, GW170817, the world’s first BNS merger detection, can be found on the current LLAMA analysis server in ~/.local/share/llama/past_runs/events-o2/G298048.
GCN Notice Archive¶
All GCN notices heard by the server while gcnd is running will, by default, be archived to $OUTPUT_DIR/gcn (this includes non-LIGO/Virgo triggers). If you wish to view these alerts, you can look in this directory, which contains subdirectories named YEAR-MONTH/DAY (to avoid having too many files in a single directory).
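For example, to list the notices archived on a given day (assuming the YEAR-MONTH/DAY layout described above and the $OUTPUT_DIR variable defined earlier):
# today's GCN notices; substitute an explicit date like 2019-11/04 as needed
ls "$OUTPUT_DIR/gcn/$(date +%Y-%m)/$(date +%d)"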
Log Files¶
Logs are stored in $OUTPUT_DIR/logs. While formats and contents of different log files differ based on which program created them, they are generally named after the script that created them. The LLAMA update daemon, llama run, writes to logfiles with llamad in their names; the GCN listener, llama listen gcn, writes to gcn.log. It’s often useful to know which log file was written to most recently (e.g. to find out whether the script you ran is faithfully logging output as expected); you can do this with ls -ltr.
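For example (again assuming the $OUTPUT_DIR variable from above), the following lists logs sorted oldest-first, so the most recently written files appear at the bottom:
# -l: long listing, -t: sort by modification time, -r: reverse order
ls -ltr "$OUTPUT_DIR/logs" | tail -n 5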
Running the Pipeline Automatically¶
This section will be written later.
Running the Pipeline Manually¶
Several crucial files can be generated manually (that is to say, without relying on automated trigger acquisition and pipeline execution). This section describes common usage scenarios.
skymap_info is a command line script for creating new event directories along with a new skymap_info.json file describing the gravitational wave event and specifying the GW skymap that is to be used. Creating this file is the first step in any GW multimessenger analysis in LLAMA (see the documentation for llama.files.skymap_info.SkymapInfo for more information).
Ordinarily this file is generated by one of the listener scripts (e.g. the ones listening for new LVAlerts from LIGO/Virgo’s GraceDB or the GCN Notice listener). The skymap_info script can be used to generate this file manually, however, using either the event’s GraceID (its unique ID on GraceDB) or a VOEvent XML file (as generated on GraceDB and distributed as a GCN Notice). There are many options associated with making a new skymap_info.json file (run skymap_info -h to see full documentation of the command line interface), but by default, all you need to specify is either the GraceID or a VOEvent file and everything else will be inferred.
Using Defaults with skymap_info¶
In the simplest case, you can create a new event and its corresponding skymap_info.json file using an existing VOEvent with:
skymap_info --voevent path/to/voevent.xml
or using the event’s GraceID (and, implicitly, the most recently uploaded skymap) with:
skymap_info --graceid GRACEID
Where GRACEID can be either a conventional GraceID (starting with “G” for real events, “M” for mock events) or a superevent GraceID (starting with “S” for real superevents, “MS” for mock superevents). In most cases, you will want to run with a superevent GraceID, since this is the official public data product released by LIGO/Virgo.
Whether you use --voevent or --graceid, you will now be prompted to specify whether this event is a real event or a test event. Please answer this question carefully, as it determines whether results will be automatically published to collaborators. If it is a real event, data products will go out to collaborators (via Slack and possibly other methods) when you run llama run to update the rest of the contents of the directory. This is the same behavior you’d get from the pipeline when it’s automatically triggering off of real data. Of course, if you’re looking at a real event, this is a good thing, so don’t be shy about saying YES; just make sure you answer correctly so as not to bother or confuse people needlessly.
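Putting it together, a manual run for a superevent might look like the following sketch; the GraceID is illustrative, and llama run may need additional arguments depending on your setup (see llama run -h):
# create the event directory and skymap_info.json (answer the real/test prompt carefully)
skymap_info --graceid S190425z
# then generate the remaining data products for the event
llama run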
Sensitivity and Background Studies (O3B)¶
These instructions apply starting in the second half of O3 (O3B). See the next section for pre-O3B instructions. You might still want to read those instructions for a more detailed overview of how the tools work, since this section is currently underdeveloped.
Make sure that all data is available on DigitalOcean Spaces and is organized into directories by GraceID and background neutrino list names. For mid-O3 subthreshold runs, this data is located in:
s3://llama/llama/artifacts/mid-o3-subthreshold-background/
Contents at run time:
DIR s3://llama/llama/artifacts/mid-o3-subthreshold-background/bklist/
DIR s3://llama/llama/artifacts/mid-o3-subthreshold-background/superevents/
2019-11-04 00:45 1088890 s3://llama/llama/artifacts/mid-o3-subthreshold-background/bklist_names.txt
2019-10-26 20:32 30656 s3://llama/llama/artifacts/mid-o3-subthreshold-background/missing_event_ids_cwb.npy
2019-11-04 00:30 27 s3://llama/llama/artifacts/mid-o3-subthreshold-background/placeholder.json
2019-11-03 23:58 244 s3://llama/llama/artifacts/mid-o3-subthreshold-background/skymap_info.json.template
2019-11-04 00:44 67832 s3://llama/llama/artifacts/mid-o3-subthreshold-background/superevents.txt
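A listing like the one above can be reproduced with s3cmd, assuming it has been configured with credentials for DigitalOcean Spaces:
s3cmd ls s3://llama/llama/artifacts/mid-o3-subthreshold-background/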
In principle, this can be run anywhere that has an internet connection, supports Docker (or has LLAMA dependencies and library installed), and can provide 8GB of memory for peak usage. These instructions assume you are using DigitalOcean droplets provisioned from a seed image using llama com do.
Start by getting a running server with llama installed under ~/dev/multimessenger-pipeline. Make sure it doesn’t need any extra $PATH configuration in order to work. Create a snapshot of this server; let’s say it’s called mid-o3-subthreshold for the sake of provisioning a bunch of servers for the parallel runs.
Follow the instructions in the pre-O3B section below to set things up, then launch with:
llama dev dv batch
This will run bin/mid-o3-subthreshold-background-event-runner on all servers whose names start with llama-. Make sure code is running on those servers with:
llama dev dv ls-procs
Fetch results from S3 using s3cmd:
s3cmd get --recursive \
s3://llama/llama/artifacts/mid-o3-subthreshold-background/mid-o3-subthreshold-background/ \
.
Finally, collate files into a results table as described below and put them into the background distribution file used for calculating p-values.
Sensitivity and Background Studies (pre-O3B)¶
At time of writing, LLAMA uses DigitalOcean hosting for sensitivity studies. There are a set of scripts used for setting up a large (~100) number of identical DigitalOcean “droplets” (virtual private server instances), getting analyses running on them, collating the outputs, and then spinning down those droplets.
Before you can spin up any servers, you’ll need to have someone on the GECo DigitalOcean account invite you to join the team (an existing team member can do this by visiting the team management page and inviting the new user). You will need to create a DigitalOcean account using the email provided (or log in if you already have one), at which point you will be able to switch to the GECo team account by clicking the profile button in the top right corner of the DigitalOcean web interface.
You’ll now need an API token in order to be able to access DigitalOcean using LLAMA’s utility scripts (i.e. without using the web interface). Your access token is basically like a password, and you should treat it with EXTREME care. If someone gets access to your token, they’ll be able to rack up charges on the team server (e.g. by mining cryptocurrency or something) and even delete our data. Ideally, once you create your token, you should only store it on the computer that will be using it, and you should make sure not to put copies anywhere where they can be abused, e.g. a git repository (a very common mistake!!). You should also delete old tokens once they are no longer being used, or if you have the slightest suspicion that any have been compromised.
Go to the API token page and create a new token, making sure you check the box for write access. (You’ll only have one chance to copy this token down; if you accidentally navigate away from the page, you’ll just need to delete that token and generate a new one.) Now, you can add this token to your command line environment by sticking the following line somewhere in your .bashrc (or .bash_profile for MacOS):
export DIGITALOCEAN=<YOUR-TOKEN-HERE>
Where, of course, <YOUR-TOKEN-HERE> is replaced with the API token you just generated. Close and reopen your shell window to make sure that this environment variable is loaded. You can check that it’s loaded by running:
echo $DIGITALOCEAN
which should print the API token you just created. If it didn’t, make sure you edited the correct file (again, MacOS uses ~/.bash_profile, Linux uses ~/.bashrc), that you wrote the export command above (with no spaces around the = sign!), and that you restarted your bash shell after saving so that the startup script loaded the new variable.
Editing your .bashrc¶
Note: your .bashrc or .bash_profile (on MacOS) files are scripts that run whenever bash starts up. They let you do things like set environment variables (what we’re doing now) and other useful things to customize your shell and make it more useful. You can edit your .bashrc or .bash_profile with:
# On Linux
nano ~/.bashrc
# On MacOS
nano ~/.bash_profile
and then add the same export DIGITALOCEAN=<YOUR-TOKEN-HERE> line mentioned above.
Using the DigitalOcean API¶
You can use the command-line utility llama com do for interacting with the DigitalOcean API through the command line (our scripts for running sensitivity studies rely on this script). If you have the LLAMA pipeline installed, you can find this script in <REPODIR>/bin (where <REPODIR> is the location of the LLAMA software directory).
Since the scripts we are using are mostly located in <REPODIR>/bin, it’s probably easiest to clone the full LLAMA repository (if you haven’t already). If you already have this code and have <REPODIR>/bin in your $PATH (i.e. you can use the commands it contains at the command line), you can skip the next subsection.
LLAMA Scripts¶
All llama scripts used for background runs are part of the llama distribution’s command line tools. Use a Docker image or install the pipeline locally to get access to them. The two main tools you’ll be using are llama com do, to start up DigitalOcean servers and interact with the infrastructure, and llama dev drop, to control the servers.
Adding Your SSH Keys to the Droplets¶
You will need to be able to SSH into the droplets where the analysis is being run in order to get the analyses started, check on their statuses, and fetch output data. In order to do this, you will need to give an SSH key (sort of like a password that is unique to your computer) to DigitalOcean under the GECo account. By default, llama com do will add all available GECo SSH keys to newly provisioned droplets (so that all team members can access analysis servers at will).
For starters, you’ll have to make SSH keys if you haven’t already. If you put your key in the default location, you can print it to the terminal with:
cat ~/.ssh/id_rsa.pub
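If you don’t have a key yet, you can generate one with OpenSSH’s ssh-keygen; the key type and default path below match the id_rsa location assumed above (the comment string is just an example label):
# writes ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub; accept the default path when prompted
ssh-keygen -t rsa -b 4096 -C "your.name@your-device"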
Next, make sure you’re logged in as GECo on DigitalOcean, then copy the above key and navigate to the DigitalOcean security page to add your SSH key. Create a new SSH key and name it something that makes it clear who you are and which device you are using (e.g. Stef-Laptop). Once you’ve saved your SSH key, you’ll have automatic access to all future droplets created through the llama com do interface.
Installing Requirements¶
The core CLI tools are part of the LLAMA distribution, though you will also need to have the developer packages installed (defined in requirements-dev.txt and as Conda packages in the Dockerfiles in /docker/llama-dev/). You can use the developer Docker images to satisfy the non-python dependencies or install them on your local system.
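For a local (non-Docker) setup, installing the developer packages might look like this sketch, which assumes a working Python environment and a checkout of the repository:
# run from the root of the LLAMA repository checkout
pip install -r requirements-dev.txt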
The scripts and examples in this guide were written with bash 4 in mind. If you are on any recent Linux system, you will almost certainly be running bash 4 or later. MacOS systems, however, come with bash 3 installed by default (due to licensing issues, they cannot distribute newer versions). To install the latest version of bash on MacOS:
1. Install MacPorts using their instructions.
2. Install bash with sudo port install bash.
3. Make a copy of the old list of shells (used by the system to keep track of available shells) with cp /etc/shells ~/etc-shells.
4. Edit the list of shell files with sudo nano /etc/shells and add /opt/local/bin/bash on a new line. Feel free to use a different text editor than nano, e.g. the default graphical text editor with sudo open /etc/shells.
5. Run chsh -s /opt/local/bin/bash to tell your computer that the MacPorts version of bash is the one you would like to use by default. You will need to enter your password to do this.
6. Without closing your current shell window, open a new window and run echo $BASH_VERSION. You should see something starting with “4”.
If things look weird, or if the terminal doesn’t start properly, go back to the old shell window you were using, restore the old version of the /etc/shells file (in case you broke it) by running sudo cp ~/etc-shells /etc/shells followed by chsh -s /bin/bash to go back to the default system bash. Open a new window to confirm that you’ve fixed the damage (everything should now be back to normal), then carefully retry the above steps.
DigitalOcean Administration Examples¶
Show full help documentation with:
llama com do -h
You can list the current running droplets using:
llama com do -D
If you only want to list droplets whose names start with “llama”, you can add a wildcard to the end of the command (as if you were adding a wildcard to match a bunch of files in a folder):
llama com do -D llama*
You can also print just the IP addresses of these servers by telling llama com do to print only those columns (you can choose any combination of columns besides this example; see the help documentation with the -h flag for a full list):
llama com do -D llama* -C ip
You can get rid of the header row that labels the columns (useful if you are going to pipe those IP addresses to another script) by adding the -n (no headers) flag:
llama com do -n -D llama* -C ip
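For instance, you could loop over those IP addresses in a shell script. This is a sketch; the root login user is an assumption and may differ on your droplets:
# print uptime for every droplet whose name starts with "llama"
# (quoting the pattern keeps the shell from expanding it against local files)
for ip in $(llama com do -n -D 'llama*' -C ip); do
    ssh root@"$ip" uptime
done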
You can list the currently saved snapshots on your DigitalOcean account with:
llama com do -S
Create a single droplet called llama-provision from the snapshot called llama-parallel-early-er14-o2-bg with:
llama com do -c llama-provision -i llama-parallel-early-er14-o2-bg
You can create 98 servers numbered llama-parallel-00 through llama-parallel-97 (0 included) using the llama-parallel-early-er14-o2-bg snapshot with:
llama com do \
-c llama-parallel-{00..97} \
-i llama-parallel-early-er14-o2-bg
You can delete all servers whose names start with “llama” with the -d flag; you’ll be asked for final confirmation before the deletion happens, so make sure to double check that the servers listed are the ones you want to actually delete (deleting droplets is NOT reversible!):
llama com do -d llama*
(When creating droplets, be sure to type “y” at the prompt to confirm that you want to create them; note that it will take a while for them to start up. You can monitor their progress by running llama com do -D as mentioned above; droplets will be listed as new while they are provisioning, and you will not be able to log into them or otherwise use them until they are finished starting up and are listed as active.)
Preparing a Server¶
When you run this analysis in parallel, you’ll be doing it on a bunch of different virtual servers. In order to keep things as simple as possible, you want these servers to be identical in all respects (except, possibly, for input files, which will need to be different if you’re not doing some sort of Monte Carlo sampling like we do in the background studies). In particular, you’ll want to avoid copying large files to and from the individual servers; a 7GB list of skymaps will eat up a whole day on a good connection if you need to copy it from Columbia’s network to 100 separate DigitalOcean servers.
Server Preparation Overview¶
The solution (described in detail where necessary in the links below) is to:
1. Create a single droplet, most likely cloning it from the disk snapshot used for the most recent parallel run (see the code example above).
2. Set everything up on it by moving data files into place and deleting old output files; see the section on moving files into place below.
3. Shut it down at the command line with sudo shutdown now.
4. Take a snapshot of the droplet by opening it on the DigitalOcean control panel, going to the snapshot page, entering a descriptive name (try to make it clear when this run happened and which set of skymaps it was run on), and clicking the make snapshot button. Making the snapshot will take ~1 hour, so go do something else while you wait.
Once you’ve done these tasks, you can spin up servers and get the analysis started. Step 2 is the most complicated and most subject to change; the following section describes it in detail.
Moving Data into Place¶
Note: this whole section will be run on some LIGO DataGrid server. You will need to SSH into it with gsissh. At time of writing, simulated skymaps are located on ldas-pcdev1.ligo-la.caltech.edu. Enter your LIGO credentials with:
kinit albert.einstein@LIGO.ORG
and enter your LIGO password when prompted (obviously replacing “albert.einstein” with your LIGO username; you’ll need DataGrid credentials for this, so make sure to set those up if you haven’t already). Now create a grid proxy from the Kerberos credentials initialized above with:
ligo-proxy-init -k
Now, you can SSH into the LIGO DataGrid mentioned above:
gsissh ldas-pcdev1.ligo-la.caltech.edu
(Alternatively, if you are logging in with ssh albert.einstein@ssh.ligo.org, you would select LLO and then pcdev1.) Navigate to the directory where Rainer has been keeping skymaps:
$ cd /home/kenneth.corley/GWHEN_sample_skymaps
$ ls
BBH_design_sample_skymaps BBH_O3_sample_skymaps BNS_O3_sample_skymaps
BBH_O2_sample_skymaps BNS_design_sample_skymaps README.txt
You should see a list of directories whose names describe the type of skymaps contained. Skymaps should be in a directory called something like allsky within one of these directories. Sometimes the skymaps are split up between multiple runs. For example, to get the O3 sensitivity above-threshold BNS skymaps, run:
cd BNS_O3_sample_skymaps/BNS_O3_1/allsky
There should be very many (8978) .fits skymap files (along with other output files generated by the skymap simulations) in this directory. (We’ll only zip up BNS_O3_1 since most of the skymaps are in this directory; we will ignore BNS_O3_2 for now.) You will need to compress these skymaps into .fits.gz files (the format required by the LLAMA pipeline). You can’t write to Rainer’s directories, so you’ll need to make a directory in your own home directory to zip these files to (let’s call it BNS_O3_sample_skymaps for clarity). Run the following command from the allsky folder above:
mkdir ~/BNS_O3_sample_skymaps
# run this next line from within the directory containing the skymaps
# you're zipping
find . -name '*.fits' | while read fitsfile; do
echo "On ${fitsfile}...";
gzip \
<"${fitsfile}" \
>~/BNS_O3_sample_skymaps/"$(basename "${fitsfile}")".gz;
done
(This will take a while to complete.) Once you’ve zipped these files, you can take that output directory and zip it up again into a compressed tarball (.tar.gz extension). This will put the whole directory into a single file, which makes it easier to move around (the final file will be ~10GB in size).
cd
tar -czf BNS_O3_sample_skymaps.tar.gz BNS_O3_sample_skymaps
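You can sanity-check the archive before moving it anywhere by listing its contents; this only reads the tarball’s index, so it is fast:
# list the first few archived files without extracting anything
tar -tzf BNS_O3_sample_skymaps.tar.gz | head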
You now have a tar archive of the skymaps you want to use! Congratulations. You’ll probably want to download this directly from the server you are provisioning (since the connection between DigitalOcean’s servers and LIGO’s data centers is probably faster than your home connection). Read on for instructions on where to move the files on the DigitalOcean server you are provisioning.
This section should be run on the DigitalOcean droplet you are provisioning.
SSH into the DigitalOcean server you’ve created for provisioning purposes. All files for background and sensitivity studies are located in ~/pre-er14-background (for historical reasons). Navigate there and list the contents:
$ cd ~/pre-er14-background
$ ls
bklist neutrinos signal-events
events scratchevents signal-neutrinos
LOCKDIR signal-doublet-events skymap-lists
logs signal-doublet-neutrinos skymaps
These directories contain inputs and outputs for the background and sensitivity studies. We are particularly interested in:
bklist: Lists of neutrinos used for background calculations.
skymaps: A directory containing subdirectories filled with skymaps (since you might want to lump together a few runs’ worth of simulated skymaps, each in a separate directory, into a single background run) used by both background and sensitivity runs. For example, you might put a directory called allsky filled with .fits.gz skymaps into skymaps, so that a typical input skymap called 0.fits.gz would be located at ~/pre-er14-background/skymaps/allsky/0.fits.gz.
skymap-lists: Lists of filepaths to skymaps in skymaps that specify subsets of the full set of skymaps. These are used to distinguish between 2- and 3-detector cases; each list will have skymaps that correspond to a specific set of GW detectors that participated in the reconstruction of that skymap. When you start the analysis, you can specify one of these lists to only run on skymaps corresponding to that detector configuration. These skymap lists contain one full skymap path per line.
You can make these skymap lists by downloading the coincs.dat file from Rainer’s directories on ldas-pcdev1.ligo-la.caltech.edu. The fastest way to get these and the skymaps is to download them from our DigitalOcean Spaces directories; simulated skymaps are located in subdirectories of https://llama.nyc3.digitaloceanspaces.com/simulations/ligo-la/~kenneth.corley with names like BNS_O3_sample_skymaps.tar.gz providing descriptive titles (you will also need the filename hash, so the easiest way to do this is to list the files in that subdirectory through a command-line interface or the Spaces web browser). Within the unzipped tarfile, you’ll be able to find the coincs.dat files in the specific simulation run you want to use; in the above example, that would be in the BNS_O3_1 subdirectory. For reference, on the LLO LDG servers (ldas-pcdev1.ligo-la.caltech.edu), coincs.dat would be in the /home/kenneth.corley/GWHEN_sample_skymaps/BNS_O3_sample_skymaps/BNS_O3_1 directory. Once you have coincs.dat, you can extract the skymap filename lists with something like:
for dets in H1,L1 H1,L1,V1 H1,V1 L1,V1; do
    START="$dets" python -c 'import os; open("../skymap-lists/bns-o3/"+os.environ["START"].replace(",", "")+".csv", "w").write("\n".join(f"/home/vagrant/pre-er14-background/skymaps/{l.split()[1]}.fits.gz" for l in open("participating-detectors-bns-o3.txt").readlines() if l.split()[0] == os.environ["START"])+"\n")'
done
Confirm that the total number of skymap filenames matches the number of filenames split between the lists (the coincs file should be one line longer because it contains an initial header line):
$ wc -l ~/pre-er14-background/skymap-lists/bns-o3/*
  2013 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1L1.csv
  5507 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1L1V1.csv
   703 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1V1.csv
   755 /home/vagrant/pre-er14-background/skymap-lists/bns-o3/L1V1.csv
  8978 total
$ wc -l ~/pre-er14-background/skymaps/coincs-bns-o3.dat
  8979 coincs-bns-o3.dat
The file contents should look something like this:
$ head ~/pre-er14-background/skymap-lists/bns-o3/H1L1V1.csv
/home/vagrant/pre-er14-background/skymaps/bns-o3/1.fits.gz
/home/vagrant/pre-er14-background/skymaps/bns-o3/2.fits.gz
/home/vagrant/pre-er14-background/skymaps/bns-o3/3.fits.gz
/home/vagrant/pre-er14-background/skymaps/bns-o3/4.fits.gz
/home/vagrant/pre-er14-background/skymaps/bns-o3/5.fits.gz
/home/vagrant/pre-er14-background/skymaps/bns-o3/6.fits.gz
/home/vagrant/pre-er14-background/skymaps/bns-o3/8.fits.gz
/home/vagrant/pre-er14-background/skymaps/bns-o3/16.fits.gz
/home/vagrant/pre-er14-background/skymaps/bns-o3/17.fits.gz
logs: Contains logfiles for each run; background logfiles are just a timestamp (an integer) followed by .log (with the largest number representing the latest timestamp, and hence the most recently started run).
events: Output run directory containing one directory for each simulated background trigger (i.e. a GW skymap with a list of neutrinos plus all outputs of the analysis).
Running the Analysis in Parallel¶
Starting the Servers¶
See how many servers you currently have running with:
llama com do -D | wc -l
List all server names with:
llama com do -D
You should probably create a single server to begin. You can make sure that parallel operations work on this server before you save a DigitalOcean snapshot of it; create a single server using the command below and start the analysis for that single server to see if any bugs pop up. (If you haven’t run the analysis in parallel in a while, there might be bugs; if you’ve made major architectural changes since the last background run, there are almost certainly bugs.)
llama com do -c llama-parallel-bns-o3-00 -i llama-parallel-bns-o3
Once you’ve debugged things on this server and gotten the analysis working, you can update the base image (llama-parallel-bns-o3 in this example) through DigitalOcean’s web interface from the working copy and spin up many more servers. Let’s say you’re limited to 100 servers and are already using a few, so you want to end up with 95 instances named llama-parallel-bns-o3-00 through llama-parallel-bns-o3-94 (assuming you’re keeping the server you created in the previous step, you only need to create the remaining 94):
llama com do -c llama-parallel-bns-o3-{01..94} -i llama-parallel-bns-o3
llama dev dv (and any vestigial pre-er-14-* CLI utilities that have not yet been ported to llama dev dv) are automatically vectorized, so the following commands should apply equally well regardless of whether you are running one server or 1,000. Outside of edge-case bugs, things should run smoothly with many servers running once you’ve fixed things in the single-server case.
Running the Analysis¶
If everything has been set up properly, you can start a background run. First, add all of the existing DigitalOcean LLAMA droplets to your SSH known hosts list with:
llama dev dv init
Confirm that no LLAMA processes are currently running (you should see a total of 0 running processes at the end of the command’s output):
llama dev dv ls-procs
Launch background runs for H1L1 (Hanford and Livingston) combined detectors with:
llama dev dv launch bg \
/home/vagrant/pre-er14-background/skymap-lists/bns-o3/H1L1.csv
Check to make sure processes are running:
llama dev dv ls-procs
See how many background run files are complete so far:
llama dev dv done bg
Print the most recent lines of the most recent logs from each server to see if anything interesting (read: bad) is happening (this can also be a useful starting point for debugging):
llama dev dv tail
You might need to execute arbitrary code on one or more remote machines. Let’s say you want to peek at the contents of the first background injection on llama-bns-o3-00. You can specify just that droplet and list the contents of a specific directory with:
llama dev dv -p llama-bns-o3-00 eval \
'ls /home/vagrant/pre-er14-background/events/injection-0'
You can, of course, use this same method to delete stuff on remote machines, move stuff, or what-have-you. You can use other llama dev dv CLI flags to control things with more granularity. When used with the eval subcommand, llama dev dv is just a vectorized, multi-threaded (read: insanely fast) version of ssh that’s aware of your DigitalOcean naming semantics; use it to make short work of remote commands that could otherwise be run using ssh foo@your.server 'some command'.
Now, assuming you’ve worked out any bugs and that things are flowing smoothly, sit back and wait until you’ve collected enough events for some level of statistical significance. Once you’ve amassed enough successful background runs, kill your LLAMA processes on all droplets with:
llama dev dv killall
Confirm all processes are dead (after a few moments):
llama dev dv ls-procs
Collecting Results¶
You should run until you have at least a few tens of thousands of events simulated. Check the progress of e.g. a background event simulation with:
llama dev dv done bg
Once you’ve simulated a sufficient number of events, you can pull the results down to your local computer. Anticipate around 1GB per 10,000 simulated events based on the pipeline architecture at time of writing. Pull results down and automatically organize them locally with:
llama dev dv pull
Next, create text tables with collated results from your simulations, making sure to change the output filename to match the population you are working with, by running the following command in the directory in which you just ran llama dev dv pull:
llama dev background table bns-o3-H1L1V1.txt
You will now have a table called bns-o3-H1L1V1.txt (or something similar based on the population you simulated). Look at the first 10 lines of this file and make sure they are somewhat reasonable looking:
$ head bns-o3-H1L1V1.txt
EVENT-ODDS-RATIO N SOURCE-DIRECTORY SKYMAP-FILENAME
=========================================================================================================================
+2.2912652264e-36 9 llama-bns-o3-65/events/injection-323 /home/vagrant/pre-er14-background/skymaps/bns-o3/8937.fits.gz
+2.673378655e-25 4 llama-bns-o3-65/events/injection-111 /home/vagrant/pre-er14-background/skymaps/bns-o3/2304.fits.gz
+1.677508305e-15 8 llama-bns-o3-65/events/injection-129 /home/vagrant/pre-er14-background/skymaps/bns-o3/4798.fits.gz
+9.3603728371e-09 7 llama-bns-o3-65/events/injection-116 /home/vagrant/pre-er14-background/skymaps/bns-o3/3400.fits.gz
+9.233942133e-25 3 llama-bns-o3-65/events/injection-324 /home/vagrant/pre-er14-background/skymaps/bns-o3/1865.fits.gz
+1.565115763e-09 9 llama-bns-o3-65/events/injection-120 /home/vagrant/pre-er14-background/skymaps/bns-o3/699.fits.gz
+3.4337601072e-09 7 llama-bns-o3-65/events/injection-312 /home/vagrant/pre-er14-background/skymaps/bns-o3/8854.fits.gz
+9.4312345753e-09 4 llama-bns-o3-65/events/injection-118 /home/vagrant/pre-er14-background/skymaps/bns-o3/7545.fits.gz
It should look something like the above (the whitespace might be different; ignore that). Most critically, you should see some reasonable distribution of non-zero neutrino trigger list lengths (the N column) and different odds ratio values. The source directories and skymap filenames should also be different and reasonable looking.
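One quick way to eyeball the distribution of trigger list lengths is to tabulate the N column with standard shell tools (this sketch assumes the two header lines shown above):
# count how many rows have each value of N (the 2nd column)
awk 'NR > 2 { print $2 }' bns-o3-H1L1V1.txt | sort -n | uniq -c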
Assuming this looks good, you can use these tables to generate the populations.hdf5 test-statistic file using llama dev background pvalue. You can also zip up these simulation results and upload them to our DigitalOcean Spaces storage using llama dev upload.