llama.versioning module
Classes for versioning files in a given directory. Currently implemented with git.
-
class
llama.versioning.
GitDirMixin
¶ Bases:
object
A mixin for
EventTuple
and FileHandlerTuple
subclasses that allows you to manipulate their event directories through a git
property returning a GitHandler
pointing to that event directory.
-
static
decorate_checkin
(func)¶ If generation and check in succeeded, commit changes to event history.
-
static
decorate_checkout
(func)¶ Commit the state of the event before file generation attempt to the event’s history and proceed with checkout.
-
property
git
¶ Get a
GitHandler
for manipulating the eventdir
as a git repository. Used for versioning events.
-
class
llama.versioning.
GitHandler
¶ Bases:
llama.versioning.GitHandlerTuple
A class that performs
git
operations on an eventdir
. You can also call an instance as if it were a function to perform git commands conveniently; the interface is the same as
subprocess.Popen
with cwd
set to the GitHandler
’s eventdir
(for convenience at the command line).
- Parameters
eventdir (str) –
The path to the directory that the new
GitHandler
instance will manipulate.
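The callable interface described above can be pictured with a small stand-in class. This is only a sketch under assumptions: `DirRunner` is a hypothetical name, not part of `llama.versioning`, and it simply forwards its arguments to `subprocess.Popen` with `cwd` fixed to the directory it was built with. The example runs the Python interpreter rather than git so that it does not depend on git being installed.

```python
import os
import subprocess
import sys
import tempfile


class DirRunner:
    """Hypothetical stand-in for a callable handler bound to one directory."""

    def __init__(self, eventdir):
        self.eventdir = eventdir

    def __call__(self, *args, **kwargs):
        # Same interface as subprocess.Popen, but always run inside eventdir.
        return subprocess.Popen(*args, cwd=self.eventdir, **kwargs)


with tempfile.TemporaryDirectory() as eventdir:
    runner = DirRunner(eventdir)
    # Ask a child process to report its working directory.
    proc = runner(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        stdout=subprocess.PIPE,
        text=True,
    )
    out, _ = proc.communicate()
    # Resolve symlinks so the comparison is stable across platforms.
    reported_cwd = os.path.realpath(out.strip())
    expected_cwd = os.path.realpath(eventdir)
```

With a real handler, the same call shape would presumably be used with a git command list such as `["git", "status"]` in place of the interpreter invocation.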
-
add
(*files)¶ Run
git add
for all files
. Raises a GitRepoUninitialized
exception if not a git repository.
-
commit_changes
(message)¶ git add
all files in the eventdir
and commit changes using message
as the commit message. Raises a GitRepoUninitialized
exception if not a git repository. This will FAIL with a GenerationError
if there are no new changes.
-
copy_file
(filename, outpath, commit_hash=None, serial_version=None)¶ Check out a copy of a file, optionally specifying a particular version of the file from this event’s history, to the given outpath. If no version is specified with
commit_hash
or serial_version
, the latest version will be copied.
- Parameters
filename (str) – The relative path to the file from
self.eventdir
(in most cases just the filename).
outpath (str) – The path to the output file, or, if this path corresponds to an existing directory, the directory in which it should be saved (with the
os.path.basename
of filename). If the file exists, it will be overwritten without warning.
commit_hash (str, optional) – The commit hash, or partial commit hash containing the starting characters of the full hash (as long as enough characters are provided to disambiguate hashes), of the version of
filename
that is to be copied to outpath
. You can only specify one of commit_hash
or serial_version
.
serial_version (int, optional) – The
serial_version
(i.e. the numbered version) of filename
to check out. This is potentially more ambiguous than using commit_hash
. You can only specify one of commit_hash
or serial_version
.
- Returns
outfile – Path to the final output file.
- Return type
str
- Raises
GitRepoUninitialized – If the event directory is not a git directory.
ValueError – If both
commit_hash
and serial_version
are specified or if they do not correspond to available file versions.
IOError – If the
outpath
cannot be written to.
FileNotFoundError – If the file checkout fails.
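The argument rules for copy_file (at most one version selector, and an outpath that may be either a file or an existing directory) can be sketched as follows. `resolve_outfile` is a hypothetical helper written for illustration; it is not the library's code and it performs no git checkout.

```python
import os
import tempfile


def resolve_outfile(filename, outpath, commit_hash=None, serial_version=None):
    """Hypothetical helper: validate version selectors and pick the output path."""
    # Only one way of naming a version may be used at a time.
    if commit_hash is not None and serial_version is not None:
        raise ValueError("Specify only one of commit_hash or serial_version.")
    # An existing directory means "save under the original basename".
    if os.path.isdir(outpath):
        return os.path.join(outpath, os.path.basename(filename))
    return outpath


with tempfile.TemporaryDirectory() as outdir:
    into_dir = resolve_outfile("skymap_info.json", outdir)
    explicit = resolve_outfile(
        "skymap_info.json", os.path.join(outdir, "copy.json"), serial_version=2
    )
    into_dir_name = os.path.basename(into_dir)
    explicit_name = os.path.basename(explicit)
```

Passing both `commit_hash` and `serial_version` to this helper raises `ValueError`, mirroring the documented behavior.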
-
property
current_hash
¶ Get the current git hash for this directory.
-
diff
(*args)¶ Return the
git diff
for the given file paths (from their last commits) as a string. Raises a GitRepoUninitialized
exception if not a git repository. This diff can be applied using git apply
.
- Parameters
*args (str, optional) – File paths relative to the root of the git directory whose diffs should be taken. If no args are provided, the result will always be an empty string.
- Returns
diff – The exact text returned by
git diff ARG1 ARG2...
for the provided arguments. An empty string is returned if none of the file contents of the given paths have changed since the last commit OR if no paths are specified (note that this differs from standard git diff
behavior, where ALL diffs from the last commit are provided if no arguments are specified).
- Return type
str
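The "no paths means empty string" rule is the part of diff that differs from plain git, so a short sketch may help. `diff_paths` below is a hypothetical function, not the library's implementation; it assumes git is on the PATH when paths are given, and it adds a `--` separator (an assumption on my part) so that path names cannot be mistaken for revisions.

```python
import subprocess


def diff_paths(repo_dir, *paths):
    """Hypothetical sketch: diff only the named paths, never the whole tree."""
    # Unlike bare `git diff`, an empty path list yields an empty diff.
    if not paths:
        return ""
    result = subprocess.run(
        ["git", "diff", "--", *paths],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# No paths given: git is never invoked and the result is empty.
no_paths = diff_paths(".")
```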
-
property
eventid
¶ Parse an eventid from the eventdir by splitting off the basename.
-
filename_for_download
(filename, last_hash=None)¶ Get a filename that includes the
eventid
, revision number, and version hash for filename
(i.e. what version number this is in the version history; e.g. if three versions of this file exist in the version history, then this is version 3). If this filename
does not appear in the git history, it will be marked ‘v0’ and the hash will be ‘UNVERSIONED’. The output format is eventid
, version, first 7 characters of the commit hash, and filename
, joined by hyphens, so that the third version of skymap_info.json
for event S1234a
with git hash dedb33f
would be called S1234a-v3-dedb33f-skymap_info.json
. Use this for file downloads or files sent to other services in order to facilitate data product tracking outside the highly-organized confines of a pipeline run directory.
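The naming scheme can be reproduced with a few lines of string formatting. `download_name` is a hypothetical helper for illustration only; the real method looks the version and hash up in the git history, whereas this sketch takes them as arguments. The long hash below is made up by padding the documented `dedb33f` prefix with zeros.

```python
def download_name(eventid, version, commit_hash, filename):
    """Hypothetical sketch of the eventid-version-hash-filename format."""
    # Files with no history get the literal marker instead of a hash.
    short_hash = commit_hash[:7] if commit_hash else "UNVERSIONED"
    return f"{eventid}-v{version}-{short_hash}-{filename}"


# Made-up 40-character hash starting with the documented prefix.
fake_hash = "dedb33f" + "0" * 33
versioned = download_name("S1234a", 3, fake_hash, "skymap_info.json")
unversioned = download_name("S1234a", 0, None, "skymap_info.json")
```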
-
hashes
(*filenames, pretty='', last_hash=None)¶ Get a list of full commit hashes for all commits related to the provided filenames. Returns an empty list if no filenames are provided or if the filename is not being tracked by git.
- Parameters
filenames (list) – Relative paths from the
eventdir
whose commits should be retrieved. Returns an empty list if no filenames are specified. To match all paths in the commit history, specify ‘--’ as the only filename.
pretty (str, optional) – The git format string specifying what to return for each commit. By default, only returns the git hash for each commit pertaining to the given
filenames
.
last_hash (str, optional) – If specified, only return hashes up to and including this hash; does not return hashes appearing topologically later than this one. This can be a partial hash containing only the starting characters of the full hash (e.g. the first 7 characters, as is typically seen elsewhere) as long as enough characters are provided to disambiguate the available hashes.
- Returns
hashes – A list of git checksums for the commits related to the specified filenames (or some other per-commit string whose contents are defined by
pretty
).
- Return type
list
- Raises
GitRepoUninitialized – If not a git repository.
ValueError – If the command cannot be run with the given filenames in the given
eventdir
.
ValueError – If the input
last_hash
is ambiguous (matches more than one hash) or if it matches no hashes.
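How a partial last_hash might be resolved and used to cut off later commits can be sketched in pure Python. `hashes_up_to` is a hypothetical helper, and the sketch assumes the input list is ordered newest first (as `git log` prints it); the short hash strings are invented for the example.

```python
def hashes_up_to(hashes, last_hash=None):
    """Hypothetical sketch: keep last_hash and everything older than it."""
    if last_hash is None:
        return list(hashes)
    # A partial hash must match exactly one known commit.
    matches = [h for h in hashes if h.startswith(last_hash)]
    if len(matches) != 1:
        raise ValueError(
            f"{last_hash!r} matches {len(matches)} hashes; expected exactly 1."
        )
    # Newest-first ordering: drop everything before the matched commit.
    return list(hashes[hashes.index(matches[0]):])


# Invented hashes, newest first.
history = ["dedb33f1", "dea0c0de", "0a1b2c3d"]
oldest_only = hashes_up_to(history, "0a")
two_oldest = hashes_up_to(history, "dea")
```

The prefix `"de"` would match two of the invented hashes and `"ff"` would match none, so both raise `ValueError` here.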
-
init
()¶ Initialize the
eventdir
as a git repository.
-
is_ancestor
(possible_ancestor_hash, commit_hash)¶ Check whether
possible_ancestor_hash
is a topological ancestor of commit_hash
. Returns True if the hashes refer to the same commit. Raises a GitRepoUninitialized
exception if not a git repository. Useful for figuring out if one commit came after another (from a data flow perspective).
- Returns
is_ancestor – True if
possible_ancestor_hash
is an ancestor of commit_hash
, False otherwise. NOTE that a value of False does not imply that commit_hash
is an ancestor of possible_ancestor_hash
(since they can be from different branches altogether).
- Return type
bool
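The note about False not implying the reverse relationship is easiest to see on a toy commit graph. The sketch below does not call git; `is_ancestor_in_graph` and the `parents` mapping are hypothetical and only model the topological relationship described above.

```python
def is_ancestor_in_graph(parents, possible_ancestor, commit):
    """Hypothetical sketch: walk parent links from commit looking for a match."""
    stack = [commit]
    seen = set()
    while stack:
        node = stack.pop()
        # A commit counts as its own ancestor, as documented.
        if node == possible_ancestor:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, []))
    return False


# Toy history: two branches, a1 and b1, forked from the same root commit.
parents = {"root": [], "a1": ["root"], "b1": ["root"]}
root_before_a1 = is_ancestor_in_graph(parents, "root", "a1")
same_commit = is_ancestor_in_graph(parents, "a1", "a1")
a1_before_b1 = is_ancestor_in_graph(parents, "a1", "b1")
b1_before_a1 = is_ancestor_in_graph(parents, "b1", "a1")
```

Here both cross-branch checks are False, which is the situation the NOTE warns about.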
-
is_clean
()¶ Return whether there are any changes made to the
eventdir
since the last commit. Raises a GitRepoUninitialized
exception if not a git repository.
-
is_repo
()¶ Checks whether this event directory is a git repo by seeing if it contains a
.git
subdirectory. Raises a GitRepoUninitialized
exception if not a git repository.
-
remove
(*files)¶ Run
git rm
for all files
. Raises a GitRepoUninitialized
exception if not a git repository.
-
reset_hard
(ref=None)¶ Hard reset the status of the branch to a given ref, losing all subsequent changes. If
ref
is not provided, reset to the last commit.
-
serial_version
(last_hash=None)¶ The serial version of this file as stored in the version history. Note that this is merely a count of how many prior versions of the file exist in this history; it is not an unambiguous label (in the same way that the hash value is). Use this for human interpretation. If the file does not exist, this function returns
0
(unversioned), so it effectively starts at 1
.
-
show_log
(ref='HEAD')¶ Show the git commit message and notes for the given
ref
.
-
text_graph
(*filenames, style='html')¶ Print a text graph of all files in the past history.
- Parameters
*filenames (str, optional) – An arbitrary list of filenames that will be spliced onto the end of the argument list for
git log
. Use this to narrow down the history shown. Use --
to specify all files in the past history of the HEAD
state.
style (str, optional) – The format to put the output in. Options include ‘html’ (if this is going to go on a summary page).
-
exception
llama.versioning.
GitRepoUninitialized
¶ Bases:
ValueError
An exception indicating that a git repository has not been initialized.