OpenReproLab#
OpenReproLab is a support program that IGE Calcul offers to M2 students. It aims to pool the IGE’s best practices around computation and to put the ideas of Open Science and Reproducible Research into practice. Students and supervisors are invited to take part in this initiative and help co-construct tomorrow’s research.
2025 session:#
Here are some of the features expected in this first round. Everything below is subject to change.
What we can provide#
a cloud computing interface based on local resources and services
several presentations, open to all but aimed at the selected students, about Open Science and Reproducible Research and the tools and methods used to put them into practice at IGE
support throughout the course: a weekly Thursday afternoon slot for discussion and debugging
preservation of the datasets and code produced during internships
For whom?#
M2 students with a computationally oriented subject and intermediate needs in terms of computing and storage resources, e.g. data exploration/analysis or light software development.
10 students maximum
Expected benefits#
for students
technical support for computing
good data science practices, compatible with IGE
for supervisors
better reproducibility of results
long-term storage of datasets and code
for the lab
potential PhD students trained in Open Science
co-construction of common practices (student - supervisor - platform - lab)
for science
increased visibility and accessibility of deliverables (data, code, etc.)
Timeline#
January 21-22: communication to CAPS and e-mail to all-IGE with the call for candidates (= supervisors)
February 18: selection of candidates
every Thursday from March 6 to July 31: training session on a tool or method + open slot for discussion and debugging, 1pm-4pm
end of July: Repro-Hackathon and election of the most reproducible internship
Program of the training sessions#
Session 1 Basic tools (March 6th)
accessing the computing platform
accessing the storage
using git: create a repo for the lab notebook
Jupyter notebooks
Session 2 Daily workflow (March 13th)
notebook: journal, project, specific software library
the target: a reproducible internship report
typical workflow
good practices : environment and enrolling the notebook before the push
Session 3 Documentation and conservation (March 20th)
for data: metadata, README, conventions, Markdown, Zenodo
for software: docstrings, Read the Docs, Software Heritage
Session 4 Advanced software development (March 27th)
modules, library packaging
pip install
environments
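As an illustration of two topics listed above (docstrings in Session 3, environments in Session 4), here is a minimal Python sketch. It is not part of the official training material: the function name `snapshot_environment` and the package list are hypothetical, chosen only to show what a documented helper that records the software environment of a run might look like.

```python
import json
import platform
from importlib import metadata


def snapshot_environment(packages):
    """Record the interpreter and package versions used for a run.

    Parameters
    ----------
    packages : list of str
        Distribution names to look up (for example ``["numpy", "xarray"]``).

    Returns
    -------
    dict
        Python version, platform string, and a mapping from each package
        name to its installed version, or ``None`` if it is not installed.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Keep the entry so missing dependencies stay visible.
            versions[name] = None
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list; replace with what the internship actually uses.
    print(json.dumps(snapshot_environment(["pip", "numpy"]), indent=2))
```

Saving such a snapshot next to the results, and committing it with the notebook, is one possible way to keep track of which environment produced a given figure or dataset.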
The rules we set for ourselves#
about selection of internships
internships aligned with platform themes (computing, open science, etc.)
internships compatible with available computing resources (GPU, HPC, etc.)
interns of IGE Computing Platform members are given priority
first-come, first-served rule
CoPil IGE Calcul can express a preference in order to distribute the increase in workload
no management of trainees: the supervisor remains responsible
reliable but not unbreakable technical infrastructure: trainees must have a plan B
Cloud computing interface#
Deployment of a JupyterHub/JupyterLab backed by GRICAD resources provided through the NOVA service
For each user: 1 to 4 CPUs and 28 GB of RAM guaranteed (expandable according to needs and timing), 50 GB of individual storage + NFS mount of a 50 TB SUMMER space for medium-term backup
Accessible from anywhere via an address such as open-repro-lab.osug.fr and authentication via GitHub/GitLab
Preloaded Pangeo-type computing environment + customization options
Persistent workdir and git workflow
Option to switch to GRICAD (more resources) during the internship
List of people for Support#
| Name | Building |
|---|---|
| Jennie | OSUG/MCP |
| Ian | OSUG-B |
| Alban | OSUG-B |
| Jordi | MCP |
| Amaury | Glacio |
| Mykael | OSUG/MCP/Glacio |