The taskmanager is the core of RendView; it is basically an event-driven
state machine which takes several actions depending on arriving events
and the current state. (If you are looking for it: The code is in
taskmanager.cpp.)
The taskmanager is responsible for launching jobs, terminating RendView,
handling arriving signals, etc.
Basic operation
Normally, when RendView gets started, you pass all the information it
needs on the command line. It then runs until it has done all the work
you want done or it decides that it cannot go on any more.
While RendView is running, you can send signals to
it.
When it has done all the work reported by the task source or it thinks
it cannot go on any more, RendView acts in different ways depending on
the operation mode. Normal RendView and the LDR server exit in this case,
while the LDR client just recovers, which means that it disconnects
from the server and gets back into a state similar to the one just
after starting it: it simply waits idle for a new server connection.
When recovery is triggered (e.g. by a sudden server disconnect), all
the currently running jobs are killed (using SIGTERM and then
SIGKILL, see signals below and
-ld-term-kill-delay) and the
assigned tasks are deleted. This means that there is no way to benefit
from the work which was done for tasks which were not yet reported back
to the server. However, this case only occurs in severe circumstances,
e.g. if you SIGKILL (not SIGTERM) the server,
if the network goes down or if there is a bug in RendView.
Normally, the LDR client gives back tasks (even unfinished frames if it
killed the render process) and the LDR server waits for all clients to
disconnect before exiting. (There are timeouts, however, see
-Ld-rtimeout, -L-kamult.)
A word about LDR
You may have guessed it already; LDR works in the following way:
You have several computers to do the work and you start
one LDR client on each of these boxes
(rendview -opmode=ldrclient, see also
-opmode). Then, whenever you want some frames to be
rendered, you simply call the LDR server
(rendview -opmode=ldrserver). The server
will then connect to all the clients (in fact you pass their address
and port using -Ld-clients) and give them the tasks.
This means that all the required files are downloaded to the client
(but see also -L-transfer, -l-r-files),
even unfinished frames (resume operation only), and that the success info
as well as the rendered/filtered frame is uploaded to the server
again. Additional files (-l-r-files,
-l-f-files) are normally only downloaded if needed
(i.e. either the client does not have that file or it was modified)
to save network bandwidth and CPU.
While the LDR server is running, it stays connected to all the
LDR clients (using one TCP connection per client). A client can
be connected to at most one authenticated server at any
time, and there is no possibility for a client to disconnect during work.
(This is necessary because both the client and the server must
be able to send data to each other at any time while working.)
The path of the tasks
The taskmanager manages the internal path all the tasks take. It has
three task queues, todo, proc and done and the
number of tasks in these queues can be read in several verbose messages.
When a task is obtained from the task source (local: because taskmanager
asked the task source to supply one more task; LDR: because the server
sends one more task to the client), then that task is put into the todo
queue and waits there to be processed.
There are normally a couple of tasks in the todo queue so that the
taskmanager can serve them for processing at any time without having to
wait for the task source. Avoiding times of inactivity (no renderer
running) was one of the major goals in early development. Especially
for LDR it is important that the LDR client always has some tasks around
so that we do not have to wait idle until a new task and all the required
files are downloaded over the network.
Do not be surprised if the task manager seems to always report or request
a couple of tasks in sequence and then does not do any request or
termination report (back to the task source) for a longer time. The todo
and done queues trigger the talking to the task source using a
threshold model (if you want to tweak around, see
-ld-todo-thresh-low, -ld-todo-thresh-high,
-ld-done-thresh-high, the same with -Ld instead of
-ld as well as -Ld-max-client-task-thresh).
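The bursty behaviour follows directly from the thresholds. Here is a minimal sketch of the idea in Python; it is not RendView's code, the class and method names are made up for illustration, and the numbers in the usage note are arbitrary example values rather than the program's defaults.

```python
class QueueThresholds:
    """Hypothetical model of the todo/done threshold logic."""

    def __init__(self, todo_low, todo_high, done_high):
        self.todo_low = todo_low
        self.todo_high = todo_high
        self.done_high = done_high

    def tasks_to_request(self, todo_len):
        """Number of tasks to request from the task source right now."""
        if todo_len >= self.todo_low:
            return 0                       # enough work queued: stay quiet
        return self.todo_high - todo_len   # refill up to the high mark

    def should_report_done(self, done_len):
        """True once enough finished tasks have accumulated."""
        return done_len >= self.done_high
```

With thresholds of 2 (low), 5 (high) and 3 (done), nothing is requested while two or more tasks wait in todo; once only one is left, four tasks are requested in a row, which is the burst you see in the messages.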
Once the taskmanager launches a job, the task is moved from the todo queue
into the proc queue, where it stays until the job terminates or the LDR client
reports the task back (as done/partly/not processed). The task is then
put into the done queue or back into the todo queue: if the job failed or is
completely processed, it goes into the done queue; if it is not completely
processed (e.g. rendered but not filtered) or not processed at all
(e.g. the LDR client gives it back unprocessed), it goes back into todo.
The done queue is explained very easily: it just accumulates some
tasks before they are reported back to the task source.
Hence, the tasks originate from the task source and they finally end up at
the task source. The task manager only knows about a couple of tasks
at a time, which is all it has to know.
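The path of a task through the three queues can be summarised in a few lines of Python. This is only a sketch under assumptions: the class and method names are hypothetical, and whether a task that returns to todo goes to the front or the back of the queue is not stated here (the sketch appends it at the back).

```python
from collections import deque


class TaskQueues:
    """Hypothetical model of the todo -> proc -> done/todo path."""

    def __init__(self):
        self.todo = deque()
        self.proc = []
        self.done = []

    def add_from_source(self, task):
        self.todo.append(task)           # task obtained from the task source

    def launch_next(self):
        task = self.todo.popleft()       # a job is launched for this task
        self.proc.append(task)
        return task

    def job_finished(self, task, failed, fully_processed):
        self.proc.remove(task)
        if failed or fully_processed:
            self.done.append(task)       # will be reported back to the source
        else:
            self.todo.append(task)       # e.g. rendered but not yet filtered

    def report_done(self):
        reported, self.done = self.done, []
        return reported                  # these go back to the task source
```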
Failures and info
You get quite a lot of information about what is being done (unless you
switch that off using the -verbose option). Needless to say
that you get informed about errors and failures.
There is a simple protection against RendView launching lots of failing
tasks in sequence (which may happen if the path to the renderer is wrong,
an additional file is missing or whatever): In case several (normally 3)
tasks fail in sequence, the task manager thinks it makes no sense to
continue work and schedules quit (or recovery in case of LDR client).
See -max-failed-in-seq if you want to tune that.
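The protection amounts to a counter that is reset by every successful task. A minimal sketch (hypothetical names, not the actual implementation); treating 0 like -1 as "disabled" is an assumption of the sketch, only -1 is documented:

```python
class FailureGuard:
    """Hypothetical model of the failed-in-sequence protection."""

    def __init__(self, max_failed_in_seq=3):
        self.limit = max_failed_in_seq
        self.failed_in_seq = 0

    def record(self, success):
        """Record one finished task; return True if work should stop."""
        if success:
            self.failed_in_seq = 0       # a success breaks the sequence
        else:
            self.failed_in_seq += 1
        # Assumption: non-positive limits switch the check off.
        return self.limit > 0 and self.failed_in_seq >= self.limit
```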
Fit strategy for LDR
The "fit strategy" is the way jobs are assigned to clients in order
to achieve good performance. This task is extremely difficult, especially
if the computers used are not equally fast.
RendView implements the following simple strategy which should work
quite well (especially when using equally-fast clients with equal
capabilities):
As long as the task source did not report the last frame, there is no
pressure on the algorithm and it assigns tasks to clients in the
following way: Take the first task in the todo queue and give it to the
best client. If there is none, go to the next task in the queue,
and so on. The best client is the one which has the least number
of assigned tasks relative to the number of simultaneous jobs (aka CPUs)
and which can process the task completely. If no client can process it
completely, a client which can do it partly is chosen. If two or more
clients are equally qualified, a round robin method is used.
If the task source reports the last frame, the "tight condition" applies.
The best client is now the one with the least number of assigned
tasks relative to its number of jobs, (nearly) irrespective of whether the client can
do the task completely or partly. In particular, no client gets more tasks
than its njobs value (aka number of CPUs) and in case some clients have
more tasks while others need some, the clients are requested to give back
all tasks which they are currently not processing (for immediate
redistribution to clients which need tasks). [Such a give-back is normally
necessary only once.]
Note that this strategy may not work very well if the clients'
abilities are asymmetric, e.g. there are 10 render clients and one
render and filter client, and that it will not smartly handle the case of
differently-fast computers (as it will never request a client to kill
some already running job).
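To make the selection rule more concrete, here is a small Python sketch of "pick the best client". It is a simplification under several assumptions and not the real algorithm: the names are hypothetical, a client's abilities are modelled as a set of stage names, "partly" is taken to mean "can do the first pending stage", the client's high task threshold is ignored, and the give-back request of the tight condition is left out.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Client:
    name: str
    njobs: int             # number of simultaneous jobs (assumed >= 1)
    stages: frozenset      # e.g. {"render"} or {"render", "filter"}
    assigned: int = 0      # tasks currently assigned to this client


def pick_client(clients, task_stages, rr_start=0, tight=False):
    """Return the best client for a task, or None if there is none."""
    n = len(clients)
    # Rotating the list implements round robin: min() keeps the first
    # of several equally loaded clients.
    order = [clients[(rr_start + i) % n] for i in range(n)]
    partly = [c for c in order if task_stages[0] in c.stages]
    fully = [c for c in partly if set(task_stages) <= c.stages]
    if tight:
        # Tight condition: complete/partly no longer matters, but no
        # client may hold more tasks than its njobs value.
        candidates = [c for c in partly if c.assigned < c.njobs]
    else:
        candidates = fully or partly
    return min(candidates,
               key=lambda c: Fraction(c.assigned, c.njobs),
               default=None)
```

A caller would advance rr_start after each assignment and, on None, move on to the next task in the todo queue.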
An important note about render/filter shell scripts
Of course, you may use a shell script instead of a renderer or filter
program (by specifying the shell script as the binpath
in the corresponding render/filter desc, see
component data base). There are many reasons why one would like
to do so. However, there are some important points you must keep in mind
when doing that.
Do not start background jobs. RendView's taskmanager needs one
process which gets started (the shell script in this case) and which
terminates just when the job is done. If you start a background job,
the shell script may terminate before the job is done leading to
corrupt frames and/or errors. (Well, you may start background jobs
but then make sure that the script waits for them to complete before
exiting.)
Handle signals properly. This is very important. RendView expects
a normal process which can be stopped using SIGTSTP
and continued using SIGCONT as well as killed using
SIGTERM. Make sure you react quickly to SIGTERM
or set the -ld-term-kill-delay large enough.
The signal handling is normally accomplished using the trap
call in shell scripts. Consult your shell manual for more information.
Use a proper return/exit code. RendView normally expects processes
to return 0 on success and some non-zero value on failure.
(Unfortunately, POVRay is bugged in a way that it returns success even
if parsing failed, and RendView's POVRay driver has some file existence
and time stamp logic to deal with that quite well.)
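The same three rules apply to a wrapper written in any language. The following sketch uses Python instead of shell purely for illustration (in a shell script, trap plays the role of signal.signal). The function names are hypothetical, the 128+N exit code is just the usual shell convention, and the code is POSIX-only and must run in the main thread.

```python
import os
import signal
import subprocess


def exit_code_for(returncode):
    """Translate the child's return code into the wrapper's exit code."""
    if returncode < 0:
        return 128 - returncode       # child died from signal -returncode
    return returncode                 # 0 on success, non-zero on failure


def run_wrapped(cmd):
    """Run cmd in the foreground, forward signals, return an exit code."""
    child = subprocess.Popen(cmd)

    def forward(signum, frame):
        child.send_signal(signum)     # pass TERM/CONT on to the real job

    def stop(signum, frame):
        child.send_signal(signal.SIGTSTP)       # stop the job first,
        os.kill(os.getpid(), signal.SIGSTOP)    # then stop the wrapper

    signal.signal(signal.SIGTERM, forward)
    signal.signal(signal.SIGCONT, forward)
    signal.signal(signal.SIGTSTP, stop)
    # Wait for the job: the wrapper must never exit before the job does.
    return exit_code_for(child.wait())
```

A wrapper script would end with something like sys.exit(run_wrapped(sys.argv[1:])) so that RendView sees the job's own exit status.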
Signal handling
RendView understands the following signals with their corresponding
action:
- SIGINT (Terminal: often ^C)
-
Upon catching the first interrupt signal, the task manager will
give back all processed and not yet processed tasks to the task source.
The tasks currently running continue execution and RendView waits for
them to finish. (That means: Frames which are being rendered will also be
filtered before RendView exits.) When all running tasks have exited, RendView
will quit gracefully (i.e. disconnect from the task driver interface and
the task source).
For LDR: The LDR server will request the clients to give back all not yet
processed tasks and all done tasks when receiving that signal, so that
you only have to wait for tasks which are currently being processed by
an LDR client.
When catching the second SIGINT, the task manager kills
all the currently running processes (using SIGTERM and, if they
do not terminate within some time, see -ld-term-kill-delay,
then finally kills them using SIGKILL). Renderers can catch
the SIGTERM and terminate gracefully leaving an unfinished frame
which can be resumed later on (see -l-cont and
-l-rcont). When all the jobs terminated (because they
were just killed), RendView exits cleanly.
When catching the third interrupt signal, RendView instantly
aborts execution (using abort(3) maybe dumping core). Do not
provoke that unless it is necessary.
- SIGTERM
-
Catching a termination signal is exactly the same as catching two
SIGINTs. This means that if you are running RendView as a batch
job and the computer shuts down, sending all processes the TERM
signal, then RendView exits cleanly as fast as possible (normally leaving
unfinished frames which can be resumed, see SIGINT above).
- SIGTSTP (Terminal: often ^Z)
-
Upon catching a terminal stop signal, RendView can act in two ways:
If the task source is a local one (i.e. RendView and
LDR server), RendView stops all currently running tasks
(using SIGTSTP) and then stops itself by sending a
SIGSTOP to itself. In case of an LDR server, a control command
is sent to the clients demanding to stop all processes (RendView is now in
mode "stopping"). When a confirmation response has been received from all
clients, the LDR server finally stops itself (SIGSTOP, mode now
"stopped").
This means that when pressing ^Z, RendView and all tasks are
stopped and you get the shell prompt back.
When using the LDR task source (LDR client) things get more
complicated as seen above: After receiving the control command to stop
all processes, stopping all processes and sending confirmation to the
server, the client goes in "stop" mode which means that it will not
launch more jobs, will not talk to the task source and will disable the
server keepalive timeout (i.e. will not consider the connection to the
server to be broken after some time of inactivity). The client
does not stop itself because that would render it completely
useless (it could not continue upon request, see SIGCONT below).
Note that all other timeouts (including the client response timeout)
stay active. This means that in case you imposed a timeout on a render
or filter job (-l-r-timeout or similar) or in case
a control command was not yet answered by the client, things are likely
to fail at the time you continue. [I will fix the client response timeout
in a future version if it turns out to be a problem. LDR works fine as
long as no non-answered client control command is pending.]
- SIGCONT (Terminal: often fg, bg)
-
When receiving SIGCONT, RendView will enter "continuing" mode
and send SIGCONT to all processes or send a continuation control
request to all LDR clients. When all processes are running again (i.e.
confirmation received from the clients), it enters normal "running" mode again.
Note that RendView will do so even if it was not in "stopping" or "stopped"
mode, which means that you can use this to continue jobs launched by RendView
which were stopped by some other means. (It also means that
the routines to decide whether to give back/get new tasks and whether to
launch a task are re-examined, which may be interesting for bug
hunting.)
The LDR client basically un-does all the things it did when receiving
the stop control command.
Note that "continuing" and "running" mode are much the same: RendView
will launch new jobs or talk to the task source in both modes.
In "stopping" and "stopped" mode, these actions are not
taken.
- SIGUSR1
-
If you send a user 1 signal to RendView, the task manager will
dump the state of all internal state variables to the terminal
(stderr). This is mainly useful in debugging (e.g. if RendView simply
does nothing but waiting or spins busily without good reason).
- SIGUSR2
-
Sending a SIGUSR2 to RendView will make the taskmanager dump
a complete list of all tasks in todo, proc and done list. Also mainly
useful in debugging but can also be used to see what is just being done.
NOTE: You will see nothing unless the TDR verbose
stream is enabled.
- SIGKILL, SIGSTOP
-
These are signals which cannot be handled by a user process.
Consequently, RendView cannot deal with them gracefully. Always use
SIGTERM instead of SIGKILL and SIGTSTP
instead of SIGSTOP unless it is absolutely necessary.
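The escalation on repeated SIGINT and the "SIGTERM equals two SIGINTs" rule can be pictured as a small state machine. The sketch below uses hypothetical names; what happens when a SIGTERM arrives after one SIGINT is not spelled out above, so the sketch assumes it simply raises the level to "kill".

```python
ACTIONS = [
    "run normally",
    "give back tasks and wait for running jobs",
    "kill jobs (SIGTERM, then SIGKILL) and exit cleanly",
    "abort immediately",
]


class InterruptState:
    """Hypothetical model of the SIGINT/SIGTERM escalation levels."""

    def __init__(self):
        self.level = 0

    def on_sigint(self):
        self.level = min(self.level + 1, 3)   # each ^C escalates one step
        return ACTIONS[self.level]

    def on_sigterm(self):
        # Assumption: SIGTERM never lowers the level and never aborts.
        self.level = max(self.level, 2)
        return ACTIONS[self.level]
```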
Parameters for the taskmanager and driver interface
I've been talking a lot about the "taskmanager" above. As you know from
the quick start, this is a slight
simplification, because the taskmanager does not do all that alone but
uses a task driver interface (which can be of type "local" or "LDR").
The task driver interface is the virtualisation of the different
ways tasks can be launched (either locally or via LDR). Consequently,
the task driver interfaces also take specific options/parameters.
Parameters for the taskmanager
The taskmanager itself knows quite a few parameters:
- -opmode=MODE
-
This switch selects the basic operation mode. Valid values for
MODE are rendview (default), ldrserver and
ldrclient.
Normal RendView mode selects the local task source and the local task
driver interface.
LDR server operation mode selects the local task source and the LDR
task driver interface
while LDR client uses the LDR task source and the local driver interface.
Instead of specifying -opmode, you can also rename (or
symlink) the RendView binary. If RendView is called ldrserver
or ldrclient, the operation mode will default to
ldrserver or ldrclient respectively. If you call
it rendview or completely differently, it will default to normal
rendview opmode.
You can use -opmode to override the operation mode set by
the binary name.
- -daemon[=VAL]
-
When used, RendView will detach from the terminal and go into background
when starting to work. This is especially useful for LDR clients.
The following table lists all possibilities:
Argument | Background | Closed streams | Alternative |
-daemon=no | no | (none) | simply do not specify -daemon |
-daemon=yes | yes | stdin | simply use -daemon |
-daemon=close | yes | stdin, stdout, stderr | |
-daemon=noclose | yes | (none) | |
You will normally want daemons to be quiet. The most convenient way
may be to use -daemon which closes stdin (especially
required for ssh connections) and then direct the output streams to
some log file:
./rendview -daemon [...] >log 2>&1
- -max-failed-in-seq=NUM (also: -mfis)
-
When at least this number of tasks failed in sequence (i.e. directly
following each other without successful tasks in between), RendView will
give up, not start any more jobs and schedule quit (i.e. wait
for all tasks to finish / clients to quit and then quit (local) or
recover (LDR)).
You may set a value of -1 to switch off this feature, which is not
recommended.
The default value is 3.
- -etimeout=DATE
-
This sets a limit on how long RendView may run. This is useful if you
may use several boxes for rendering during the night but you have to
stop that at e.g. seven o'clock in the morning.
DATE can be specified using either an absolute or a relative
time:
Absolute time has the format "[DD.MM.[YYYY]] HH:MM[:SS]"
which means that if you want to stop at 19:00 today, you can use
-etimeout=19:00, for 19:00 on Mar 21st, use
-etimeout="21.3. 19:00".
Relative time has the format "now + {DD | [[HH:]MM]:SS}",
so if you want RendView to run no longer than 7 hours, use
-etimeout=now+7:0:0; if you want to limit
execution to 7 days, use -etimeout=now+7 (without ":");
for a limit of 30 minutes, use -etimeout=now+30:0.
For testing, you may set -l-nframes=0, then launch
RendView and check the line "Execution timeout:" in verbose
output.
See also -etimeout-sig
- -etimeout-sig=SPEC
-
When the execution timeout (as specified with -etimeout)
has passed, RendView should stop working in some way. Using this option, you
can specify how RendView reacts. Possible values for SPEC are:
int: behave like catching one SIGINT.
term: behave like catching one SIGTERM.
abort: Immediately abort. Do not use if avoidable.
See above for RendView's reaction to signals.
- -cycitimeout=SEC
-
This is the run cycle idle timeout which only affects the active task
sources, i.e. the LDR client. If the client is idle (meaning
not connected to an authenticated LDR server) for more than
SEC seconds, then it will terminate (more precisely: behave like
catching one SIGINT).
This can be useful if you want clients to terminate automatically when
you do not give them jobs for some time.
- -load-max, -load-min=VAL
-
This is the "load control": When specified, RendView will not start jobs
when the load is greater than or equal to -load-max but instead
wait until the load is lower than -load-min again.
The value VAL is the desired load value multiplied
by 100 specified as an integer (i.e. 150 for load 1.5).
This option is probably not too useful. You cannot use it to regulate
the number of launched jobs (try out if you do not believe me). However,
if you get told that your rendering may only start jobs if the machine
you are sharing with others has a load of below 1 (or so), then this
can be used.
See also -load-poll-msec
- -load-poll-msec=MSEC
-
If the load value is so high that no job may be started, RendView has to
check the load continuously to see when it is down again. The checking is
done in intervals of length MSEC milliseconds.
See also -load-max.
- -schedule-delay=MSEC
-
This is mainly useful in debugging. When re-scheduling is necessary,
instruct the taskmanager to not do that immediately but wait
MSEC milliseconds before scheduling.
Of course, this defaults to 0 and you should not use it.
It can be used as a crude fix in case the taskmanager spins idle
wasting CPU. But better report such a case as a bug to the author.
- -dumptask=SPEC
-
You get informed on the terminal via verbose output about what happens
to a task. Using this switch, you can specify when you want to get
informed. The syntax for SPEC is +/-VAL...
where VAL consists of one or several letters with the following
meaning:
a: dump info on task arrival (LDR)
q: dump info when task is being queued in todo queue
b: dump info when reporting task as done
d: dump info ??? task was and given back/destroyed
r: dump info when rendering is done
f: dump info when filtering is done
+Z: turn all info on
-z: turn all info off
Capital letters mean that you get long info (i.e. complete
task dump) while small letters only lead to a one-line short info.
Examples: -dumptask=+QDarf-d (default) or
-dumptask=-Rf+QD-r (where the -r will cancel the
previous +R). If unsure, try out to see the effect...
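Of the options above, the relative form of -etimeout is the one most easily misread, so here is a small Python sketch of how the three documented examples are interpreted. It only models the "now + {DD | [[HH:]MM]:SS}" form; the function name is hypothetical and the parser is not RendView's own.

```python
from datetime import timedelta


def parse_relative_timeout(spec):
    """Turn a "now+..." spec into a timedelta (sketch, not RendView code)."""
    text = spec.replace(" ", "")
    if not text.startswith("now+"):
        raise ValueError("not a relative time spec: %r" % spec)
    body = text[len("now+"):]
    if ":" not in body:
        return timedelta(days=int(body))        # "now+7" means 7 days
    parts = body.split(":")
    if len(parts) > 3:
        raise ValueError("too many fields in %r" % spec)
    # Right-align onto (hours, minutes, seconds); empty fields count as 0.
    fields = [""] * (3 - len(parts)) + parts
    hours, minutes, seconds = [int(f) if f else 0 for f in fields]
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)
```

The real check remains the one suggested above: run with -l-nframes=0 and read the "Execution timeout:" line.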
Parameters for the local task driver interface
The local task driver interface is used whenever a job (renderer or filter)
has to be executed on the local machine (thus in normal RendView and in
LDR client operation mode). It knows the following parameters.
- -ld-njobs=NUM
-
Specify the number of simultaneous jobs to start. RendView will always try
to have NUM processes running at a time; it may be fewer
but never more.
If RendView can detect the number of CPUs in the computer, NUM
defaults to that value. Otherwise, the default is 1.
- -ld-term-kill-delay=MSEC
-
In case RendView decides that a job has to be killed, it will first send
it a SIGTERM. However, if the job does not terminate within
MSEC milliseconds, it will finally kill it using SIGKILL.
The default is 1000 msec (1 second).
- -ld-todo-thresh-low, -ld-todo-thresh-high, -ld-done-thresh-high=NUM
-
These are the todo and done queue thresholds.
The task driver will start requesting new tasks from the task source if
there are fewer than todo-thresh-low tasks in the todo
queue. This does not apply to the LDR client which gets tasks assigned
by the LDR server. The client cannot demand new tasks; the
LDR server has to take care that the clients have enough tasks.
The taskmanager will never store more than todo-thresh-high
tasks in the todo queue, i.e. it will stop asking the task source for
more tasks when this many tasks are in the todo queue. This also
does not apply to the LDR client.
done-thresh-high is the number of tasks which have to
accumulate in the done queue before reporting (all the tasks in the
queue) back to the task source as "done". Use of 1 for the LDR
client is recommended but not mandatory. (It is safest if the LDR client
gives back info about successful frames as quickly as possible. High
values may prevent the LDR server from giving new tasks to the LDR client
because it thinks there are already enough tasks assigned to the
client which were not yet reported back.)
Defaults should be reasonable.
- -ld-r-mute, -ld-r-quiet
-
Direct render output to /dev/null so that it does not clutter
your terminal. Using -ld-r-mute will only tie stdout
to /dev/null while -ld-r-quiet does it for
both stdout and stderr.
Default: Both switched off. Switching -ld-r-quiet on
is recommended.
- -ld-r-nice, -ld-f-nice=NVAL
-
Start render/filter processes with the specified nice value NVAL.
Values of 10 to 20 are probably good if other people or processes also
want to run on the box.
See also -ld-r-nice-jitter below.
Default: No nice value.
- -ld-r-nice-jitter, -ld-f-nice-jitter
-
When used, vary nice values randomly by adding or subtracting 1
to prevent the render/filter processes from terminating simultaneously.
May not have the desired effect, though. Use
-no-ld-r-nice-jitter to switch off.
- -ld-r-jobs-max, -ld-f-jobs-max=NUM
-
Limit the number of simultaneous render/filter processes, respectively.
Note that -ld-njobs is the overall limit which cannot
be exceeded. However, you may find that the filter runs so fast that
it is sufficient to run one at a time, which has the advantage that
filtered frame files will be less fragmented on the hard drive.
There may also be other reasons for using this.
If you specify a limit of 0, then no rendering/filtering will be done.
This is unwise for normal RendView and LDR server operation because
the frames to be filtered will get stuck in todo queue and finally
nothing goes on any more.
On the LDR client side, you may use a limit of 0 to make sure that this
client does not get frames to be rendered. This works because the client
will then report no render/filter descs to the server.
Both values default to -ld-njobs.
- -ld-r-timeout, -ld-f-timeout=SEC
-
Specify a timeout in seconds for the render/filter process. The timeout
specifies the maximum time between launching the render/filter process
and its termination. In case the timeout is passed, the normal
SIGTERM, SIGKILL sequence is sent to the process
(see -ld-term-kill-delay). Use a value of -1 to disable.
- -ld-r-detach-term, -ld-f-detach-term
-
If you disable these (using e.g. -no-ld-r-detach-term),
then you allow the terminal to keep control over the render process.
This is not recommended (because of SIGINT,
SIGTSTP signal handling).
Default: enabled
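Several of the options above refer to the "SIGTERM, then SIGKILL" sequence governed by -ld-term-kill-delay. As a rough illustration, this is what the sequence looks like in Python. The function name is hypothetical, and unlike RendView (which is event driven) the sketch simply blocks while it waits.

```python
import signal
import subprocess


def terminate_job(proc, term_kill_delay_msec=1000):
    """Ask a job to terminate, kill it if it does not; return its status."""
    proc.send_signal(signal.SIGTERM)          # polite request first
    try:
        return proc.wait(timeout=term_kill_delay_msec / 1000.0)
    except subprocess.TimeoutExpired:
        proc.kill()                           # SIGKILL on POSIX systems
        return proc.wait()
```

A job that exits on SIGTERM within the delay gets the chance to leave a resumable frame behind; one that is killed does not.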
Parameters for the LDR task driver interface
The LDR task driver is used by the LDR server. It effectively handles
all the LDR server stuff, including all the network and transfer issues.
It understands the following options.
- -L-clients=CLIENTS
-
The most important option; it specifies a list of LDR clients to use.
The syntax is a space-separated list of client specs where each client
spec looks like one of "HOST", "HOST/PORT",
"HOST/PORT/PASSWORD",
"HOST//PASSWORD".
HOST is either an IPv4 address of the host the client is
running on, or a domain name which gets resolved via the standard
resolve library.
PORT is the TCP port the client listens to. It defaults to
the value specified with -Ld-port (see below).
PASSWORD is a password for this client. It defaults to the
value specified with -Ld-password (see below).
- -Ld-port=NUM
-
Specify the (default) LDR client TCP port.
The default LDR port is 3104.
- -Ld-password=PASS
-
Specify the (default) client password. See the LDR client description
in the tasksource section for more info
about the authentication.
Apart from a password string, you may use the following special
values:
none: no password (insecure). This is also the case if
you do not specify one.
prompt: prompt you for the password
(using getpass(3)).
file:PATH: read password from file PATH
(No more than 128 bytes will be read; falls back to prompt
if an error occurs or the file is empty.)
Specifying the password on the command line is insecure; using
prompt or file: is better, because it
will then not show up using ps(1) or top(1) and will
not be left in your shell history file. You may also consider passing
the password spec using the environment var RENDVIEWARGS
(see component data base) but don't pass
the literal password there because it may be possible to access the
environment as well (Linux users: have a closer look at /proc).
- -Ld-ctimeout=MSEC
-
The connection timeout in milliseconds; that is the maximum time allowed
to pass between initiating a connection to the client and completing
the authentication handshake.
The default is 15 seconds.
Note that this timeout as well as -Ld-rtimeout does
not have millisecond precision; values below 1000 (1 second) do not
make sense.
- -Ld-rcinterval=SEC
-
Re-connect interval. In case the LDR server could not connect to a client
or got disconnected during operation for whatever reason, you may want
it to retry the connection from time to time. The rcinterval
option specifies this interval in seconds.
Note that due to internal scheduling, the actual interval time may
be up to twice as large (which is not really a problem).
A value of -1 switches off this feature.
Default is 5 minutes.
- -Ld-keepalive=SEC
-
Send the ping control command to all clients every SEC
seconds. This makes sure that they are still up and working because
there is a timeout on the response time to all control commands
(see -Ld-rtimeout below).
Use a value of -1 to switch that off, which is not recommended because
the ping is used to detect unreachable clients (for whatever reason:
network failure, client computer reboot, etc.).
The default is 30 seconds.
Note that no keepalive ping requests are sent if the connection
is busy due to other operation (down/uploading tasks,...) because in these
cases the server knows that the client is still there.
- -Ld-rtimeout=MSEC
-
Maximum time it may take a client to respond to a control command (like
stop/cont/kill tasks, ping, disconnect). The client is considered dead
if the response does not arrive within MSEC milliseconds.
Use -1 to disable this feature which is not recommended.
The default is 5 seconds.
Note: You may need to increase this default when large files
are transferred. The reason is that the LDR client server connection
is one TCP connection used in full duplex mode. That means the server
can send a control request to the client while the client uploads a file
to the server. However, the client cannot send the response before the
file is uploaded completely. If the whole -Ld-rtimeout
timeout passes while uploading the file, the client will get kicked
although it should not. There is no easy way to solve that because
we must not get trapped by stalled file uploads.
- -Ld-todo-thresh-low, -Ld-todo-thresh-high, -Ld-done-thresh-high=NUM
-
This works just like the corresponding options of the local task
driver interface, see above.
- -Ld-max-jobs-per-client, -Ld-max-client-task-thresh=NUM
-
These are protective parameters. When the server connects to the client,
the client reports (to the server) its -ld-njobs
(see above) value (number of parallel tasks to start)
as well as the "high task threshold"
which is the number of tasks which the client would like to have assigned
at any time (which is higher than -ld-njobs so that the
client always has some tasks around to be able to quickly start new
jobs whenever some running jobs terminate -- without having to wait for
the LDR server to supply new tasks).
-Ld-max-jobs-per-client specifies the maximum
-ld-njobs value accepted from the clients. Higher values
will get decreased to -Ld-max-jobs-per-client.
-Ld-max-client-task-thresh is the limit for the
"high task thresh" reported by the client and hence limits the number
of jobs which are assigned to a client at any time.
Both features can be switched off using a value of -1 (if you trust
your clients).
The defaults are 24 and 36, respectively.
- -Ld-r-timeout, -Ld-f-timeout=SEC
-
This specifies the task driver's timeout for render/filter jobs.
This is normally not needed as you can set a timeout on the server side
using the (local) task source (-l-r-timeout) and on the
client side using the local task driver (-ld-r-timeout).
When you use it, the task sent to the client will contain the shorter
of the two timeouts (i.e. -l-r-timeout and
-Ld-r-timeout).
A value of -1 disables the timeout (default).
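As an illustration of the client list syntax (HOST, HOST/PORT, HOST/PORT/PASSWORD, HOST//PASSWORD), here is a short Python sketch that splits such a list into its parts. The function name and the host names in the usage note are made up, and allowing a "/" inside the password is an assumption of the sketch.

```python
def parse_clients(spec, default_port=3104, default_password=None):
    """Split a space-separated client list into (host, port, password)."""
    clients = []
    for item in spec.split():
        # Split at most twice so that a password may contain "/"
        # (assumption, not documented behaviour).
        fields = item.split("/", 2)
        host = fields[0]
        if len(fields) > 1 and fields[1]:
            port = int(fields[1])
        else:
            port = default_port          # "HOST" and "HOST//PASSWORD"
        password = fields[2] if len(fields) > 2 else default_password
        clients.append((host, port, password))
    return clients
```

For example, "alpha beta/4000 gamma/4001/secret delta//pw" yields four clients, of which alpha and delta use the default port and alpha and beta the default password.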
The task drivers
Task drivers actually launch the tasks. Currently, there is the
POVRay render task driver which supports several versions of
POVRay (at least 3.1g and 3.5) as
well as a generic filter driver supporting any filter which reads the
input image from stdin and writes it to stdout.
Unfortunately, POVRay is a bit bugged when it comes to its exit
status. It returns 0 (success) even if parsing failed and no output
was actually generated. Hence, RendView applies some tricks to work
around this: It checks if the output file exists and also checks the time
stamp: if the modification time is older than the launch time of POVRay,
the file obviously did not get touched and rendering is considered as
failed even if POVRay returns "success".
You can read about "spurious success" in the output in such a case.
The filter driver also checks for the output frame existence but does
not apply any time stamp checks (because a filter may decide to actually
not touch an image for whatever reason).
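The two checks can be sketched in Python as follows. The function names are hypothetical and the real drivers may look at more than this; the sketch only mirrors the rules stated above (existence plus time stamp for the renderer, existence only for the filter).

```python
import os


def render_succeeded(exit_status, output_path, launch_time):
    """Render check: exit status, output existence and time stamp."""
    if exit_status != 0:
        return False
    try:
        mtime = os.stat(output_path).st_mtime
    except FileNotFoundError:
        return False                  # no output file at all
    # A file older than the launch was not touched: "spurious success".
    return mtime >= launch_time


def filter_succeeded(exit_status, output_path):
    """Filter check: exit status and output existence, no time stamp."""
    return exit_status == 0 and os.path.exists(output_path)
```

Here launch_time is a time stamp in seconds since the epoch (as returned by time.time()) taken just before the renderer is started.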