A dedicated page with documentation is available at https://jug.rftd.org. This page is just a summary.
If you use Jug to generate results for a scientific publication, please cite:
Coelho, L.P., (2017). Jug: Software for Parallel Reproducible Computation in Python. Journal of Open Research Software. 5(1), p.30.
Jug is a lightweight, Python-only, distributed computing framework.
Jug allows you to write code that is broken up into tasks and run different tasks on different processors. You can also think of it as a lightweight map-reduce type of system, although it's a bit more flexible (and less scalable).
It has two storage backends. One uses the filesystem to communicate between processes and works correctly over NFS, so you can coordinate processes on different machines. The other uses a Redis database; all it needs is for the different processes to be able to reach a common Redis server.
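To make the idea of coordinating through a shared directory more concrete, here is a minimal sketch of the general technique. It is not Jug's actual implementation: the names task_key, try_claim, and run_once, the file suffixes, and the use of pickle are all assumptions made for illustration. Each task gets a name derived from its function and arguments, a worker claims it by creating a lock file atomically, and the result is written under the same name so that later callers can load it instead of recomputing.

```python
import hashlib
import os
import pickle
import tempfile


def task_key(func_name, args):
    """Derive a stable identifier from the function name and arguments (hypothetical scheme)."""
    payload = repr((func_name, args)).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def try_claim(store_dir, key):
    """Create the lock file atomically; only the first caller gets True."""
    lock_path = os.path.join(store_dir, key + ".lock")
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def run_once(store_dir, func, *args):
    """Return (status, result): load a stored result, skip a locked task, or compute it."""
    key = task_key(func.__name__, args)
    result_path = os.path.join(store_dir, key + ".result")
    if os.path.exists(result_path):
        with open(result_path, "rb") as fh:
            return "cached", pickle.load(fh)
    if not try_claim(store_dir, key):
        return "busy", None
    try:
        result = func(*args)
        # Write under a temporary name, then rename, so readers never see a partial file.
        tmp_path = result_path + ".tmp"
        with open(tmp_path, "wb") as fh:
            pickle.dump(result, fh)
        os.replace(tmp_path, result_path)
    finally:
        os.remove(os.path.join(store_dir, key + ".lock"))
    return "computed", result


def square(x):
    return x * x


store = tempfile.mkdtemp()
print(run_once(store, square, 7))  # ('computed', 49)
print(run_once(store, square, 7))  # ('cached', 49)
```

In this toy model, any number of worker processes pointed at the same directory would skip tasks that are locked or already stored, which is the behaviour the status tables in the example further down reflect.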
Jug is a pure Python implementation and should work on any platform. Python 3 is supported (version 3.3 and later).
Jug Documentation and Tutorial
Here is a one-minute example. Save the following to a file called primes.py:
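from jug import TaskGenerator
from time import sleep

@TaskGenerator
def is_prime(n):
    sleep(1.)  # artificial delay so the tasks do not finish instantly
    for j in range(2, n - 1):
        if (n % j) == 0:
            return False
    return True

# list() forces the lazy map, so all 99 tasks are created when the file is loaded
primes100 = list(map(is_prime, range(2, 101)))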
Of course, this is only for didactic purposes; normally you would use a better primality test. Similarly, the call to sleep is only there so that the example does not run too fast.
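As one possible "better method", the sketch below uses trial division that stops at the square root of n and skips even candidates. The name is_prime_fast is made up for this illustration, and the function is shown without the TaskGenerator decorator so that it can be run on its own.

```python
def is_prime_fast(n):
    """Trial division by 2, then by odd candidates up to sqrt(n)."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    j = 3
    # Any composite n has a factor no larger than sqrt(n), so stop once j*j exceeds n.
    while j * j <= n:
        if n % j == 0:
            return False
        j += 2
    return True


print([n for n in range(2, 30) if is_prime_fast(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

For the numbers up to 100 used here the difference is negligible, but the loop runs roughly sqrt(n)/2 times instead of n times, which matters for larger inputs.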
Now type jug status primes.py to get:
Task name                    Waiting       Ready    Finished     Running
------------------------------------------------------------------------
primes.is_prime                    0          99           0           0
........................................................................
Total:                             0          99           0           0
This tells you that you have 99 tasks called primes.is_prime ready to run. To run them, type jug execute primes.py & at the shell prompt. You can even run multiple instances in the background (if you have multiple cores, for example). After starting 4 instances and waiting a few seconds, you can check the status again (with jug status primes.py):
Task name                    Waiting       Ready    Finished     Running
------------------------------------------------------------------------
primes.is_prime                    0          63          32           4
........................................................................
Total:                             0          63          32           4
Now you have 32 tasks finished, 4 running, and 63 still ready.
Eventually, they will all finish and you can inspect the results with jug shell primes.py. This will give you an IPython shell. The primes100 variable is available, but it is an ugly list of jug.Task objects. To get the actual values, you call the value function:
In [1]: primes100 = value(primes100)
In [2]: primes100[:10]
Out[2]: [True, True, False, True, False, True, False, False, False, True]
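The list of booleans is in the same order as the inputs, so it can be paired with range(2, 101) to recover the primes themselves. The sketch below can be run outside the Jug shell: is_prime_plain is a made-up, undecorated copy of is_prime (without the sleep), and flags stands in for what value(primes100) returns.

```python
def is_prime_plain(n):
    # Same test as is_prime in primes.py, minus the decorator and the sleep.
    for j in range(2, n - 1):
        if (n % j) == 0:
            return False
    return True


numbers = list(range(2, 101))
# Stand-in for the list of booleans obtained from value(primes100).
flags = [is_prime_plain(n) for n in numbers]

# Keep each number whose flag is True.
primes = [n for n, flag in zip(numbers, flags) if flag]
print(primes[:10])  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(len(primes))  # 25
```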
The full API docs include several worked-out examples. There is also a video (vimeo or showmedo) and a presentation.
Mailing List: https://groups.google.com/group/jug-users
Stable releases are available on PyPI; the cutting-edge version is on GitHub. The code is MIT licensed.
You should be able to install it with pip:
pip install jug
Copyright (c) 2009-2023. Luis Pedro Coelho. All rights reserved.