VRE practical tips for responsible use
Use of the VRE
Surf Research cloud is a flexible system that gives the user a lot of autonomy. This means that a lot of responsibility lies with the user, which means the user has to be aware of possible consequences of his actions. The most probable risks are causing security vulnerabilities or data leaks, losing data, and losing credits (money) by inefficient use. This page handles these topics and provides practical tips to minimize these risks and improve the user experience.
Installation of software
On some workspaces the user gets admin privileges. This gives you the possibility to install additional software applications and libraries. Be aware that this can cause vulnerabilities and/or data leaks. To minimize this risk, make sure you are aware of the origin of the software that you would like to install and check other quality measures (known issues, number of users/downloads, citations, GitHub Stars, etc.). You are responsible for the software you install on the workspaces.
Pausing or Deleting workspaces
It is always recommended to pause or delete a workspace when you are not working on it and when it is not busy with a task. This is to minimize the use of resources (credits, energy, capacity). Although pausing a workspace is often an attractive option, it is only recommended for a period of a couple of days maximum. By deleting workspaces and creating a new workspace, you make sure you always have a machine with the most recent (security) updates. This can be done efficiently by:
- documenting or automating installation steps
- regularly saving data and scripts to a Persistent storage volume, e.g. Yoda or, Research Drive
- managing code on GitHub
The VRE support team is happy to help you with this.
Wallets and credits
Remember that you pay for workspaces and storage volumes with the wallet you receive from the UU or through a grant. To spend them wisely:
- Delete or pause resources that you are not using. Pause workspaces only if you plan to continue working with it on a short term, in other cases delete them as workspaces also consume credits for storage. Consider moving data to e.g. Yoda and deleting a storage volume when not using research cloud for more than a month.
- Only select large workspaces if you are sure that your tools or scripts make efficient use of the available compute power. If you are not sure, consider contacting the Research Engineering team for advice.
The UU provides a small starting budget of 10K credits for pilots or small projects, when this budget has been depleted we may provide a one time raise of your wallet of 10K credits if you can complete the project with this raise. Alternatively it is possible to submit a Small Compute application via SURF.
Cost calculator
Use the calculator below to estimate the costs for the workspaces and storage volumes that you create. The costs of a workspace are determined by the number of CPU cores or GPU devices plus a small amount for workspace storage (SSD). Remember that a workspace will consume credits whenever it is in “State: Running”. It will only stop consuming credits when you Pause or Delete a workspace in the Research Cloud portal. Use the expiry date to schedule deletion of your workspace as an extra safeguard. Storage volumes consume credits until you delete them in the Research Cloud portal.
Note: this describes costs for the standard use case, in which workspaces run on SURF’s HPC Cloud environment. Although ResearchCloud also allows you to run workspaces in different environment (such as Microsoft’s Azure cloud environment), the costs for this will be significantly higher.
#| standalone: true
#| viewerHeight: 600
from shiny import App, render, ui
import numpy as np
import matplotlib.pyplot as plt
import faicons
import shinyswatch
app_ui = ui.page_fluid(
ui.layout_sidebar(
ui.sidebar(
ui.input_radio_buttons(
"device", "Processing Unit", ["CPU", "GPU"], selected="CPU"
),
ui.output_ui("device_controls"),
),
ui.value_box(
title="Estimated costs workspace*",
showcase=faicons.icon_svg("circle-dollar-to-slot",width="50px"),
value=ui.output_ui("estimate"),
theme="bg-gradient-blue-purple",
),
ui.output_plot("plot_compute"),
),
theme = shinyswatch.theme.sandstone,
)
def server(input, output, session):
@output
@render.ui
def device_controls():
if input.device() == "CPU":
return ui.TagList(
ui.input_slider("cores", "Number of CPU cores", 1, 30, 4, step=1),
ui.input_slider("time", "Estimated time", 1, 50, 12, step=1),
ui.input_radio_buttons(
"time_unit", "Time Unit", ["hours", "days"]
),
)
else:
return ui.TagList(
ui.input_slider("gpus", "Number of GPU devices", 1, 4, 1, step=1),
ui.input_slider("time", "Estimated time", 1, 50, 12, step=1),
ui.input_radio_buttons(
"time_unit", "Time Unit", ["hours", "days"]
),
)
@render.plot(alt="Cost estimator")
def plot_compute():
ssd_hourly_cost = 1.525 * 100 / (30 * 24) # assumed on average 100GB SSD
if input.device() == "CPU":
x_max = 50.0
t = np.arange(0.0, x_max, 1.0)
if input.time_unit() == "days":
cost = t * 24 * (input.cores() + ssd_hourly_cost)
else:
cost = t * (input.cores() + ssd_hourly_cost)
else:
x_max = 20.0
t = np.arange(0.0, x_max, 1.0)
if input.time_unit() == "days":
cost = t * 24 * (input.gpus() * 21 + ssd_hourly_cost)
else:
cost = t * (input.gpus() * 21 + ssd_hourly_cost)
fig, ax = plt.subplots()
if input.time_unit() == "days":
ax.set_ylim([0, 1000 * 10])
else:
ax.set_ylim([0, 1000])
ax.set_xlim([0, int(x_max)])
ax.plot(t, cost, label="Credits")
ax.grid()
ax.vlines(input.time(), 0, cost[input.time()], colors='r', linestyles='dashed', label='Estimated time')
if input.device() == "CPU":
ax.hlines(y=cost[input.time()], xmin=0, xmax=input.time(), color='r', linestyle='dashed')
else:
ax.set_xticks(np.arange(0, int(x_max)+1, step=5))
ax.hlines(y=cost[input.time()], xmin=0, xmax=input.time(), color='r', linestyle='dashed')
ax.set(xlabel='Time in # of ' + input.time_unit(), ylabel='Credits',
title='Cost of running a workspace')
@render.text
def estimate():
ssd_hourly_cost = 1.525 * 100 / (30 * 24) # assumed on average 100GB SSD
if input.device() == "CPU":
if input.time_unit() == "days":
cost = input.time() * 24 * (input.cores() + ssd_hourly_cost)
else:
cost = input.time() * (input.cores() + ssd_hourly_cost)
else:
if input.time_unit() == "days":
cost = input.time() * 24 * (input.gpus() * 18 + ssd_hourly_cost)
else:
cost = input.time() * (input.gpus() * 18 + ssd_hourly_cost)
return f"{round(cost, 1)} credits"
app = App(app_ui, server)
*Costs including workspaces storage (SSD)
#| standalone: true
#| viewerHeight: 600
from shiny import App, render, ui
import numpy as np
import matplotlib.pyplot as plt
import faicons
import shinyswatch
app_ui = ui.page_fluid(
ui.layout_sidebar(
ui.sidebar(
ui.output_ui("device_controls"),
),
ui.value_box(
title="Costs persistent storage (HDD)",
showcase=faicons.icon_svg("circle-dollar-to-slot",width="50px"),
value=ui.output_ui("estimate"),
theme="bg-gradient-blue-purple",
),
ui.output_plot("plot_storage"),
),
theme = shinyswatch.theme.sandstone,
)
def server(input, output, session):
@output
@render.ui
def device_controls():
return ui.TagList(
ui.input_slider("time", "Estimated time", 1, 50, 12, step=1),
ui.input_radio_buttons(
"time_unit", "Time Unit", ["days", "months"]
),
ui.input_slider("storage", "Storage volume size in GB", 5, 1500, 250, step=5),
)
@render.plot(alt="Cost estimator")
def plot_storage():
x_max = 50.0
t = np.arange(0.0, x_max, 1.0)
if input.time_unit() == "days":
cost = t * input.storage() * 0.681 / 30
else:
cost = t * input.storage() * 0.681
fig, ax = plt.subplots()
if input.time_unit() == "days":
ax.set_ylim([0, 1000])
else:
ax.set_ylim([0, 1000 * 30])
ax.set_xlim([0, int(x_max)])
ax.plot(t, cost, label="Credits")
ax.grid()
ax.vlines(input.time(), 0, cost[input.time()], colors='r', linestyles='dashed', label='Estimated time')
ax.hlines(y=cost[input.time()], xmin=0, xmax=input.time(), color='r', linestyle='dashed')
if input.time_unit() == "days":
ax.set(xlabel='Time in # of days', ylabel='Credits',
title='Cost of persistent storage')
else:
ax.set(xlabel='Time in # of months', ylabel='Credits',
title='Cost of persistent storage')
@render.text
def estimate():
if input.time_unit() == "days":
cost = input.time() * input.storage() * 0.681 / 30
else:
cost = input.time() * input.storage() * 0.681
return f"{round(cost, 1)} credits"
app = App(app_ui, server)
VRE lifetime and security updates
The recommended maximum lifetime of a VRE (or workspace) is no more than 4 weeks.
A longer lifetime is only considered secure when the operating system and applications are regulary updated and the workspace is by one of the users, and the workspace is actively monitored.
Users are responsible for keeping the installed software up to date (patching) with regard to security updates. By limiting the lifetime of a VRE you can minimize the risk of vulnerabilities by outdated software, because everytime you create a new workspace, this will come with recent security updates already installed.
Ubuntu Linux workspaces on ResearchCloud currently have automatic operating system updates enabled by default. However, custom software that you may install (especially if they are server applications, like webapps) will still need to be patched and monitored. See here for more information. Of course you can contact us for help.
For longer-living workspaces, UU also requires use of scanning tools to scan active VRE’s for security vulnerabilities. If the scan gives cause, the UU will contact you with the request to update software on VRE, and/or to take additional measures. If you intend to run longer-running workspaces, please contact us and we can help to install the necessary scanning software.
Data storage and – backup
A VRE is not backed up. The user is responsible for backing up the (processed) data and/or configuration of the VRE. Make sure to regularly synchronize data and scripts with approved platform like Yoda, Research Drive and GitHub.
Sensitive data
VRE’s have been classified as suitable for data classified as “sensitive” within the UU data classification framework.
However, caution is always advised if you are planning to process sensitive data—especially personal data—on a VRE. While the workspace types supported by UU and SURF come with secure default configurations, changing these configurations can adversely impact security. Moreover, there are always some levels of risks involved with processing data in the cloud.
Depending on the precise nature of your data and its sensitivity, additional security precautions may therefore be advised.
During your intake meeting, we will discuss whether potentially sensitive data will be processed and can advise whether additional measures are required. If necessary, we will recommend further consulations with a data manager.
SANE
For cases where data is so sensitive that only a data manager or an external data owner should be able to up- and download it, SURF offers a special high security environment within ResearchCloud called SANE that needs to be activated on request. We can advise you on whether this is necessary and suitable for your project.
There is also a variant of SANE in which the researcher is required to interact with the data ‘blindly’.
Security incidents
If you believe there has been (suspected) misuse or unauthorized use of login details and / or VRE, please report this to CERT or send an email to its.ris@uu.nl and “pause” your VRE using the SURF Research Cloud portal.
Relevant policies
Privacy policies