Task -- a task scheduler for distributed computations
-----------------------------------------------------

OVERVIEW
--------

A Task network needs two types of instances: one server and a set of
clients. The server dispatches commands, typically shell scripts, from
a queue maintained locally, and saves the resulting output locally.
The clients receive these commands, run them, and send the output back
to the server.

Clients connect to the server and communicate over TCP/IP. Clients can
connect and disconnect dynamically. As a result, commands may be
interrupted, in which case the server requeues them until they run to
completion on a client. Commands can likewise be added to the queue or
removed from it at any moment; the output of completed commands is not
removed from the server.

There is no built-in security or authentication of any form. However,
by default the server is bound to the loopback interface and accepts
connections only from localhost. Trust is achieved by running the server
on a controlled machine and establishing connections via ssh forwarding.

The server is controlled by special clients that send instructions and
return immediately upon acknowledgment or completion.

RUNNING THE SERVER
------------------

  task server [-h host] [-p port] [-A] [-L]

  -h host	Specify hostname to bind to (default: localhost)
  -p port	Specify port to bind to (default: 8000)
  -A		Bind to all the network addresses on this host
  -L		Bind to localhost (default)

RUNNING A CLIENT
----------------

  task client [-h host] [-p port] [-n processes] [-ws file|-wc command]

  -h host	Specify server hostname (default: localhost)
  -p port	Specify server port (default: 8000)
  -n processes	Specify the maximum number of tasks to be run simultaneously
  		by this client instance (default: number of online CPUs)

  -ws file	
  -wc command	Run 'command' (or the script 'file') and continuously read
  		from its output. Every time the character '0' is read, the
  		client is suspended and the tasks running on it are cancelled.
  		Every time '1' is read, the client signals the server that it
  		is back up and resumes normal operation.
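
  By default -n is the number of online CPUs. On a shared machine it may
  be preferable to leave some CPUs free. The sketch below computes such a
  value; the helper name client_slots and the idea of a "reserve" are
  illustrative assumptions, not part of Task.

```shell
# Number of worker slots to offer: online CPUs minus a reserve, never below 1.
# Usage: client_slots [reserved]
client_slots() {
	reserved=${1:-1}
	# Fall back to 1 if getconf does not know _NPROCESSORS_ONLN.
	online=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
	slots=$((online - reserved))
	[ "$slots" -ge 1 ] || slots=1
	echo "$slots"
}
```

  The result can then be passed to the client, for example:
  task client -n "$(client_slots 2)"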

CONTROL
-------

  task add [-h host] [-p port] [-o dir] {-s file|-c command} {-a file|name...}
  task rm [-h host] [-p port] [-o dir] [-f d|r|p...] [-a file|name...]
  task exec [-h host] [-p port] {-s file|-c command}
  task list [-h host] [-p port] [-l] [done] [running] [pending] [d|r|p...]
  task nodes [-h host] [-p port]

  -h host	Specify server hostname (default: localhost)
  -p port	Specify server port (default: 8000)
  -o dir	Specify the output directory on the server side
  
  -s file
  -c command	Specify a command, or a script file containing the command
  
  -a file
  name...	Specify a set of tasks by their name, or a file containing
  		the task names.

  -t		Display task durations
  -l		Long form (task start and stop times)

  -f d|r|p...	Select only tasks that have a given status:
  d|r|p
  done		- successfully completed
  running	- currently running on a client
  pending	- still in the queue


CONTROLLING TASKS
-----------------

  A single command may be used to run several tasks, by specifying several
  task names. Different tasks sharing the same command are differentiated
  by the fact that commands are prepended with the string
  	
  	task=<taskname>
  
  before being executed, where <taskname> is the name of the task.
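
  The effect of the prepended assignment can be tried locally, without a
  server. The sketch below imitates it with a hypothetical helper,
  run_as_task, and assumes that the assignment ends up on its own line
  ahead of the command; the exact separator used by Task is not stated
  here.

```shell
# Mimic the documented behaviour: put "task=<taskname>" ahead of the command,
# then hand the result to the shell. The newline separator is an assumption.
run_as_task() {
	printf 'task=%s\n%s\n' "$1" "$2" | sh
}

# One command, three tasks; each run sees its own name in $task.
for name in alpha beta gamma; do
	run_as_task "$name" 'echo "would process input-$task.dat"'
done
```

  A command can therefore use $task to choose its input, its random seed,
  or the name of its output file.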
  
  'task add' queues a series of tasks, all using a single specified command.
  
  'task rm' removes and stops tasks matching the provided server-side output
  directory, task status, and task name. If one parameter is not provided,
  any value for that parameter will be matched. Note that if no parameter
  is provided at all, for safety reasons, the help is displayed and nothing
  is removed. Use 'task rm -f drp' to remove all tasks.
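
  As an illustration, the two wrappers below queue a parameter sweep and
  later cancel whatever part of it has not started yet. They use only the
  options listed under CONTROL; the function names, the output directory
  and the command are hypothetical.

```shell
# Queue one task per name, all sharing the same command.
# Usage: queue_sweep <outdir> <command> <name>...
queue_sweep() {
	outdir=$1
	cmd=$2
	shift 2
	task add -o "$outdir" -c "$cmd" "$@"
}

# Remove the tasks of that output directory that are still pending.
# Usage: cancel_pending <outdir>
cancel_pending() {
	task rm -o "$1" -f p
}
```

  For example, queue_sweep results './simulate --seed "$task"' 1 2 3 4
  queues four tasks named 1 to 4, and cancel_pending results removes
  those that are still in the queue.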
  
  'task exec' queues one synchronizing command for every connected client.
  Each command is synchronizing in that:
  (a) it is started only after all previous normal tasks are done or running;
  (b) it is started on a client that is not executing any other task;
  (c) no later task can be assigned to the same client until it has completed.
  This is useful, for example, to schedule a code update or a recompilation,
  then immediately start queuing tasks for the updated code. Note that
  'task exec' blocks until all the commands it schedules have completed,
  displaying their output. Run it in the background with & to avoid
  blocking.
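
  A rebuild followed by a new batch of tasks could be wrapped as in the
  sketch below. The function name and the commands are hypothetical, and
  it is an assumption that 'task exec' reports a failed command through
  its exit status; if it does not, the && simply behaves like a sequence.

```shell
# Run a synchronizing command on every connected client, then queue tasks.
# 'task exec' blocks, so 'task add' is issued only once it has returned.
# Usage: update_then_queue <build command> <task command> <name>...
update_then_queue() {
	build_cmd=$1
	run_cmd=$2
	shift 2
	task exec -c "$build_cmd" && task add -c "$run_cmd" "$@"
}
```

  For example: update_then_queue 'git pull && make' './run "$task"' a b c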
  
  'task list' shows the content of the server queues.

  'task nodes' shows a summary of the connected clients.
  
SSH EXAMPLE
-----------

On the server host, we first start the dispatcher:

	(task server -h localhost 2>task.log ) &

If we can connect to the clients from the server, we can make use
of the following script:

	hlist="client1.domain client2.domain client3.domain"

	for h in $hlist; do
	        ssh -R 8000:localhost:8000 user@$h \
	        	"mkdir -p /tmp/task && cd /tmp/task && task client" &
	done

Instead, if we can connect to the server from the client, then we can
run on each client:

	ssh -f -N -L 8000:localhost:8000 user@server

then

	task client &

Note that here we started the server with "-h localhost" explicitly, as
the server may otherwise listen only for IPv6 connections, whereas
localhost usually resolves to the IPv4 address 127.0.0.1.

WATCH EXAMPLE
-------------

We can use -ws to suspend the client whenever a given user (say, root)
needs the computer, by passing it a script such as:

	while true; do
		sleep 1
		if who | grep -q '^root '; then
			echo 0
		else
			echo 1
		fi
	done

This way, every time root logs in, the client on the same machine sees
it, cancels the running tasks, and stops being available for further
computations until root logs out.
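
A variant of this script prints a value only when the state changes,
instead of once per second. Whether repeated identical values are
harmless to the client is not stated above; this sketch simply avoids
sending them. The function names and the polling interval are
illustrative.

```shell
# Read one state per line ('0' = busy, '1' = free) and print it only when it
# differs from the previous one.
emit_on_change() {
	last=
	while read -r state; do
		if [ "$state" != "$last" ]; then
			echo "$state"
			last=$state
		fi
	done
}

# Poll 'who' every few seconds and print 0 while the user is logged in,
# 1 otherwise.
# Usage: watch_user <user> [interval in seconds]
watch_user() {
	while true; do
		if who | grep -q "^$1 "; then echo 0; else echo 1; fi
		sleep "${2:-1}"
	done
}
```

A script given to -ws would contain these two functions followed by the
line: watch_user root | emit_on_change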


COMPILATION
-----------

'make' is enough.


REQUIREMENTS
------------

A libc that supports the _BSD_SOURCE (for gethostname()) and _POSIX_SOURCE
(for kill() and getaddrinfo()) feature test macros.