Yoda uses delayed and remote rule execution. There are different reasons to apply these constructs, and different ways to use them. Currently there is a bug in iRODS whereby output parameters are not passed to delayed or remote rules. This document explains where and how these constructs are applied.
The following concerns need to be dealt with:
- Security: we must protect the integrity/availability of data in the vault.
- Interactivity: the interactive user should not wait for tasks.
- Scalability: tasks with long execution time should not block other processes.
- Traceability: it should be traceable who was responsible for certain changes.
To address these concerns, we made the following decisions:
Decision: asynchronous execution of tasks with long or uncertain execution time. Tasks that require a long (say, more than 2 seconds) or uncertain execution time must be executed asynchronously from the interactive session.
Rationale: this allows the interactive session to continue (Concern 2), and these jobs to be scheduled/prioritized (Concern 3).
Impact: The interactive user is not informed of the result of the task, only that the task is pending. The user should be informed of the end result via alternative channels.
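The decision above can be illustrated with a minimal sketch in plain Python. It is not Yoda or iRODS code: the in-memory queue, the `status` dictionary, and the names `request_task` and `worker` are assumptions made for illustration only. The point is that the synchronous part only registers the task and reports it as pending, while a separate worker executes it.

```python
import queue
import threading

# Hypothetical in-memory job queue; a real system would persist its jobs.
jobs = queue.Queue()
status = {}  # job id -> "PENDING" | "DONE" | "FAILED"


def request_task(job_id, func, *args):
    """Synchronous part: register the task and return immediately."""
    status[job_id] = "PENDING"
    jobs.put((job_id, func, args))
    # The interactive user only learns that the task is pending.
    return "PENDING"


def worker():
    """Asynchronous part: run queued tasks outside the interactive session."""
    while True:
        job_id, func, args = jobs.get()
        try:
            func(*args)
            status[job_id] = "DONE"
        except Exception:
            status[job_id] = "FAILED"
        finally:
            jobs.task_done()


# A daemon thread stands in for the scheduler that picks up pending tasks.
threading.Thread(target=worker, daemon=True).start()
```

In this sketch the end result is only visible in `status`; reporting it to the user through another channel is left out.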
Decision: exclusive privileges for vault. Tasks in the vault require exclusive privileges. These privileges may only be assigned to actors that can be trusted.
Rationale: Data in the vault must be protected from undesired changes that may damage the integrity/authenticity that is required for reuse of the data (Concern 1).
Impact: Privileges may not be assigned to an interactive user, because credentials may be compromised and human errors must be prevented. Privileges must be limited to defined procedures that can be trusted to maintain integrity/authenticity. The system must register who ordered such a procedure (Concern 4).
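As a rough illustration of this decision, the sketch below limits privileged execution to a fixed set of defined procedures and records who ordered each one. All names here (`TRUSTED_PROCEDURES`, `audit_log`, `run_privileged`, and the procedure names) are hypothetical; this is not how Yoda or iRODS implements it.

```python
from datetime import datetime, timezone

# Hypothetical allow-list of procedures that are trusted to touch the vault.
TRUSTED_PROCEDURES = {"research_to_vault", "publish"}

# Hypothetical audit trail; a real system would store this durably.
audit_log = []


def run_privileged(procedure, requested_by, run):
    """Run `run()` with system privileges only for a trusted procedure."""
    if procedure not in TRUSTED_PROCEDURES:
        raise PermissionError(f"{procedure!r} is not a trusted procedure")
    # Register who ordered the procedure before executing it.
    audit_log.append({
        "procedure": procedure,
        "requested_by": requested_by,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return run()
```

The user never receives the privileges; the user can only ask for one of the defined procedures to be run.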
We identified the following constructs to provide privileged and/or asynchronous execution:
- ExecCmd-of-irule for privileged execution.
- Delayed Rule for asynchronous execution. Currently this does not work in all cases, due to a bug with delayed rules that have output parameters in the top-level call.
- Cronjob for asynchronous AND privileged execution. This allows a job to be ‘picked up’ and executed within the system environment. Drawback: cronjobs can only be scheduled at minute granularity, so certain jobs can take up to about a minute to start. We use cronjobs as a fallback when we cannot use delayed rules due to the bug in iRODS.
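The cronjob fallback can be sketched as a function that is called once per cron run and picks up every job that is still pending. This is a simplified illustration, not the actual Yoda job script: the job dictionaries, the `handlers` mapping, and the script path in the crontab comment are assumptions.

```python
# Hypothetical crontab entry that starts the pickup script every minute
# (the finest granularity cron offers); the path is made up:
#   * * * * * /usr/bin/python3 /path/to/run_pending.py


def run_pending(jobs, handlers):
    """Process every job still marked PENDING and return the ids handled."""
    processed = []
    for job in jobs:
        if job["status"] != "PENDING":
            continue
        try:
            # Look up and run the handler registered for this action.
            handlers[job["action"]](job)
            job["status"] = "DONE"
        except Exception as exc:
            # Unknown actions and handler errors both end up as FAILED.
            job["status"] = "FAILED"
            job["error"] = str(exc)
        processed.append(job["id"])
    return processed
```

Because a job waits for the next cron run, the delay before it starts depends on when in the minute it was registered.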
The table below describes the ideal application of the above constructs. For now, the cronjob is sometimes used for all asynchronous executions (bottom row).
| ||User Environment||System Environment|
|Asynchronous||Delayed Rule||Delayed Rule + ExecCmd|
- Discuss the motivation for using the system/iRods user to increase privileges. Should we ideally use a specific user/role with specific privileges for the vault?
- Discuss how the state model is reflected in the publication space.
- Discuss applying ‘pending’ states to the toVault as well, in line with the pending states for publication.
- Deletion of a Data Package from the vault is not yet implemented.
- Create state model for revisions and replications.
- Create state model for updating metadata.
The image below shows the state model, and indicates the actions that require asynchronous execution and/or privileges. The table below describes the individual actions.
|1a||Research2Vault (request)||sync||user||Register Action and request execution|
|1b||Research2Vault (execution)||async||system||ACCEPTED - SECURED; new object: UNPUBLISHED|
|2a||Vault2Research (request)||sync||user||Register action and request execution|
|2b||Vault2Research (execution)||async||user||New object: LOCKED||Copy files|
|3a||SubmitPublication (request)||sync||user||Register action and request execution|
|3b||SubmitPublication (execution)||sync||system||UNPUBLISHED - SUBMITTED_FOR_PUBLICATION|
|4a||ApprovePublication (request)||sync||user||Register action and request execution|
|4b||ApprovePublication (execution)||sync||system||SUBMITTED_FOR_PUBLICATION - APPROVED_FOR_PUBLICATION||trigger Publish|
|5||Publish||async||system||APPROVED_FOR_PUBLICATION - PUBLISHED||Create/register DOI, PMH, etc. (Open point: there is no registration for the public area?!)|
|6a||RejectPublication (request)||sync||user||Register action and request execution|
|6b||RejectPublication (execution)||sync||system||SUBMITTED_FOR_PUBLICATION - UNPUBLISHED||(is still async, can become sync in due course)|
|7a||Published2pending (request)||sync||user||Register action and request execution|
|7b||Published2pending (execution)||sync||system||PUBLISHED - PENDING_DEPUBLICATION|
|7c||Pending2depublished||async||system||PENDING_DEPUBLICATION - DEPUBLISHED||Update/register DOI, PMH, etc.|
|8a||Republish2pending (request)||sync||user||Register action and request execution|
|8b||Republish2pending (execution)||sync||system||DEPUBLISHED - PENDING_REPUBLICATION||trigger pending2published|
|8c||Pending2published||async||system||PENDING_REPUBLICATION - PUBLISHED||Update/register DOI, PMH, etc.|
|11a||Update Vault Metadata (request)||sync||user||Register action and request execution|
|11b||Update Vault Metadata (execution)||sync||system||Create metadata updates and trigger publication actions|
|11c||Update Vault Metadata (publishing)||async||system||Update/register DOI, PMH, etc.|
Note that all actions that require async/system execution are preceded by a synchronous user action that registers the action and triggers the async/system action: 1, 2, 3, 4, 6, 7, 8.
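The publication-related state changes in the table above can be captured as a small transition table. The sketch below is illustrative only: the dictionary `TRANSITIONS` and the function `apply_action` are hypothetical names, and the vault actions (1, 2) and metadata updates (11) are left out because the table does not list a simple from/to state pair for them.

```python
# Action name -> (required current state, resulting state), taken from the
# execution rows of the table above.
TRANSITIONS = {
    "SubmitPublication": ("UNPUBLISHED", "SUBMITTED_FOR_PUBLICATION"),
    "ApprovePublication": ("SUBMITTED_FOR_PUBLICATION", "APPROVED_FOR_PUBLICATION"),
    "Publish": ("APPROVED_FOR_PUBLICATION", "PUBLISHED"),
    "RejectPublication": ("SUBMITTED_FOR_PUBLICATION", "UNPUBLISHED"),
    "Published2pending": ("PUBLISHED", "PENDING_DEPUBLICATION"),
    "Pending2depublished": ("PENDING_DEPUBLICATION", "DEPUBLISHED"),
    "Republish2pending": ("DEPUBLISHED", "PENDING_REPUBLICATION"),
    "Pending2published": ("PENDING_REPUBLICATION", "PUBLISHED"),
}


def apply_action(state, action):
    """Return the new state, or raise if the action is not allowed here."""
    source, target = TRANSITIONS[action]  # KeyError for an unknown action
    if state != source:
        raise ValueError(f"{action} requires state {source}, not {state}")
    return target
```

A check like this could run in the execution step, so that a registered request is rejected when the object has changed state in the meantime.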
Actions are registered (in principle) in iCAT, in provenance (user actions), and in the system log. Registration in the system log is not always done consistently.