From d767c42a268d9963eac967307492bac5e6cf79da Mon Sep 17 00:00:00 2001
From: Anne Fouilloux
Date: Fri, 25 Oct 2024 10:35:47 +0200
Subject: [PATCH 1/2] add info on how to create new user group and S3 buckets

---
 docs/admin_hub.md | 89 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 81 insertions(+), 8 deletions(-)

diff --git a/docs/admin_hub.md b/docs/admin_hub.md
index 1722aa6..8407c02 100644
--- a/docs/admin_hub.md
+++ b/docs/admin_hub.md
@@ -18,11 +18,14 @@ The user image is defined in the [gfts-track-reconstruction/jupyterhub/images/us
 
 ## S3 Buckets
 
-Limited storage is available on the GFTS hub itself but we have setup different S3 buckets where we manage the data we need for the GFTS project. We currently have 3 different S3 buckets:
+Limited storage is available on the GFTS hub itself but we have setup different S3 buckets where we manage the data we need for the GFTS project. We currently have different S3 buckets which we can split into 2 categories:
 
-- "**gfts-reference-data**": This S3 bucket contains reference datasets (such as Copernicus Marine) that have been copied here to speed up access and data processing. Most GFTS users will only require read-only access to this bucket; a few GFTS users can write to it to copy new reference datasets. If you need additional reference datasets, please create an issue [here](https://github.com/destination-earth/DestinE_ESA_GFTS/issues/new).
-- "**destine-gfts-data-lake**": This S3 bucket contains datasets generated by the GFTS, which are intended to be made public for all users with access to the GFTS hub;
-- "**gfts-ifremer**": This S3 bucket contains datasets that are private to a specific group, in this case, private to IFREMER GFTS users. Users with access to this S3 bucket can read and write to it.
+1. Public S3 buckets: these buckets make it possible to share data with all users of the platform.
+   - "**gfts-reference-data**": This S3 bucket contains reference datasets (such as Copernicus Marine) that have been copied here to speed up access and data processing. Most GFTS users will only require read-only access to this bucket; a few GFTS users can write to it to copy new reference datasets. If you need additional reference datasets, please create an issue [here](https://github.com/destination-earth/DestinE_ESA_GFTS/issues/new).
+   - "**destine-gfts-data-lake**": This S3 bucket contains datasets generated by the GFTS, which are intended to be made public for all users with access to the GFTS hub;
+2. Private S3 buckets: these buckets contain datasets that are private to a specific group. All users of a given group have **read** and **write** access to their corresponding buckets. Users can store private datasets or save intermediate results that will be shared in **destine-gfts-data-lake** once validated.
+   - "**gfts-ifremer**": This S3 bucket contains datasets that are private to IFREMER GFTS users.
+   - "**gfts-vliz**": This S3 bucket contains datasets that are private to VLIZ GFTS users.
 
 ## Access to the GFTS Hub and S3 Buckets
 
@@ -32,15 +35,15 @@ The first step is to create an [issue](https://github.com/destination-earth/Dest
 
 1. The GitHub username of the person you want to add to the GFTS Hub;
 2. The list of S3 buckets this new person would need to access.
-3. If a new group of users is required, please specify the name of the new S3 bucket to be created for this group and identify any existing users who need access to it. **A new group of users is only necessary if you have your own set of biologging data that must remain private and not be shared with everyone**.
+3. If a new group of users is required, please specify the name of the new S3 private bucket to be created for this group and identify any existing users who need access to it. **A new group of users is necessary if you have a unique set of biologging data that must remain private and cannot be shared publicly, or if you need to share intermediate and non-validated results within a specific group before making them available to the GFTS community**.
 
 :::{seealso}
-The current list of authorized GFTS users can be found in [`gfts-track-reconstruction/jupyterhub/gfts-hub/values.yaml`](https://github.com/destination-earth/DestinE_ESA_GFTS/blob/12fa92d1a1e6f6f089a7bc8dbc26c8ed3f101b73/gfts-track-reconstruction/jupyterhub/gfts-hub/values.yaml#L150).
+The current list of authorized GFTS users can be found in [`gfts-track-reconstruction/jupyterhub/gfts-hub/values.yaml`](https://github.com/destination-earth/DestinE_ESA_GFTS/blob/main/gfts-track-reconstruction/jupyterhub/gfts-hub/values.yaml#L169).
 :::
 
-### Giving access to the GFTS Hub and S3 Buckets (Admin only)
+### Giving access to the GFTS Hub and existing S3 Buckets (Admin only)
 
 Everyone can initiate a Pull Request to add a new user with read-only access to `gfts-reference-data` and `destine-gfts-data-lake`. There is only one step:
 
@@ -62,14 +65,84 @@ adding the following steps which can only be done by a GFTS Hub admin:
 
    - `s3_ifremer_developers`: write access to `gfts-ifremer` and `gfts-reference-data`
    - `s3_ifremer_users`: write access to `gfts-ifremer` only
+   - `s3_vliz_users`: write access to `gfts-vliz` only
    - `s3_admins`: admin access to all s3 buckets
 
+   If you need to create a new user group and a private S3 bucket for them, please read the next section on creating a new user group before proceeding with steps 3–6.
+
 3. Run `tofu apply` to apply the S3 permissions. Ensure you are in the `gfts-track-reconstruction/jupyterhub/tofu` folder before executing the `tofu` command and have run `source secrets/ovh-creds.sh`.
 4. Update `gfts-track-reconstruction/jupyterhub/secrets/config.yaml` with the output of the command `tofu output -json s3_credentials_json`. This command needs to be executed in the `tofu` folder after applying the S3 permissions with `tofu apply`. If the file contains binary content, it means you do not have the rights to add new users to the GFTS S3 buckets and will need to ask a GFTS admin for assistance.
-5. Don't forget to commit and push your changes!
+5. Run `pytest` in the `tofu` directory to test s3 permissions.
+6. Don't forget to commit and push your changes!
 
 Steps 3 and 4 are what actually grant the jupyterhub user s3 access.
 
+### Creating a new group of users (Admin only)
+
+If you need to create a new user group with a corresponding private S3 bucket, follow the additional step below (to be completed after step 1 and before step 2).
+
+Choose a new group name (not too long, e.g. < 8 characters) which can be the organisation name of the user(s) or its acronym. We suggest adding the prefix `gfts-` (e.g. gfts-ifremer, gfts-vliz, etc.). In the example below, we are adding a new group of users called `gfts-vliz`:
+
+- Add the new bucket name in the `s3_buckets` variable in `gfts-track-reconstruction/jupyterhub/tofu/main.tf`:
+
+```
+  s3_buckets = toset([
+    "gfts-vliz",
+    "gfts-ifremer",
+    "gfts-reference-data",
+    "destine-gfts-data-lake",
+  ])
+```
+
+- Create a new variable to list the users who will have access to the new S3 bucket. Locate the variable `s3_ifremer_users` and add the new variable immediately after it:
+
+```
+  s3_vliz_users = toset([
+    "davidcasalsvliz",
+  ])
+```
+
+- Update the `s3_users` variable by adding the new list of users (here `local.s3_vliz_users`):
+
+```
+s3_users = setunion(local.s3_readonly_users, local.s3_admins, local.s3_vliz_users, local.s3_ifremer_developers, local.s3_ifremer_users)
+```
+
+- Create a new resource policy for this new group of users (search for `resource "ovh_cloud_project_user_s3_policy" "s3_ifremer_users"` to locate the section on resource policy for users):
+
+```
+resource "ovh_cloud_project_user_s3_policy" "s3_vliz_users" {
+  for_each = local.s3_vliz_users
+  service_name = local.service_name
+  user_id = ovh_cloud_project_user.s3_users[each.key].id
+  policy = jsonencode({
+    "Statement" : concat([
+      {
+        "Sid" : "Admin",
+        "Effect" : "Allow",
+        "Action" : local.s3_admin_action,
+        "Resource" : [
+          "arn:aws:s3:::${aws_s3_bucket.gfts-vliz.id}",
+          "arn:aws:s3:::${aws_s3_bucket.gfts-vliz.id}/*",
+        ]
+      },
+    ], local.s3_default_policy)
+  })
+}
+```
+
+Make sure you replace `vliz` with the new group name!
+
+- Create the new S3 bucket by locating resource `"aws_s3_bucket" "gfts-ifremer"` and adding the new bucket configuration immediately after it:
+
+```
+resource "aws_s3_bucket" "gfts-vliz" {
+  bucket = "gfts-vliz"
+}
+```
+
+- You are done with the configuration of the new group and its corresponding private S3 bucket. Go back to the previous section on [giving access to the GFTS Hub and S3 buckets](https://destination-earth.github.io/DestinE_ESA_GFTS/admin_hub.html#giving-access-to-the-gfts-hub-and-s3-buckets-admin-only) and follow the steps 2-6.
+
 :::{caution}
 The following packages need to be installed on your system:

From 3d0faaf1956a575d678c5c5deb484725eca8a3dd Mon Sep 17 00:00:00 2001
From: Anne Fouilloux
Date: Fri, 25 Oct 2024 10:36:59 +0200
Subject: [PATCH 2/2] pre-commit fix

---
 docs/admin_hub.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/admin_hub.md b/docs/admin_hub.md
index 8407c02..c27c551 100644
--- a/docs/admin_hub.md
+++ b/docs/admin_hub.md
@@ -24,8 +24,8 @@ Limited storage is available on the GFTS hub itself but we have setup different
    - "**gfts-reference-data**": This S3 bucket contains reference datasets (such as Copernicus Marine) that have been copied here to speed up access and data processing. Most GFTS users will only require read-only access to this bucket; a few GFTS users can write to it to copy new reference datasets. If you need additional reference datasets, please create an issue [here](https://github.com/destination-earth/DestinE_ESA_GFTS/issues/new).
    - "**destine-gfts-data-lake**": This S3 bucket contains datasets generated by the GFTS, which are intended to be made public for all users with access to the GFTS hub;
 2. Private S3 buckets: these buckets contain datasets that are private to a specific group. All users of a given group have **read** and **write** access to their corresponding buckets. Users can store private datasets or save intermediate results that will be shared in **destine-gfts-data-lake** once validated.
-   - "**gfts-ifremer**": This S3 bucket contains datasets that are private to IFREMER GFTS users.
-   - "**gfts-vliz**": This S3 bucket contains datasets that are private to VLIZ GFTS users.
+   - "**gfts-ifremer**": This S3 bucket contains datasets that are private to IFREMER GFTS users.
+   - "**gfts-vliz**": This S3 bucket contains datasets that are private to VLIZ GFTS users.
 
 ## Access to the GFTS Hub and S3 Buckets
 
@@ -141,7 +141,7 @@ resource "aws_s3_bucket" "gfts-vliz" {
 }
 ```
 
-- You are done with the configuration of the new group and its corresponding private S3 bucket. Go back to the previous section on [giving access to the GFTS Hub and S3 buckets](https://destination-earth.github.io/DestinE_ESA_GFTS/admin_hub.html#giving-access-to-the-gfts-hub-and-s3-buckets-admin-only) and follow the steps 2-6.
+- You are done with the configuration of the new group and its corresponding private S3 bucket. Go back to the previous section on [giving access to the GFTS Hub and S3 buckets](https://destination-earth.github.io/DestinE_ESA_GFTS/admin_hub.html#giving-access-to-the-gfts-hub-and-s3-buckets-admin-only) and follow the steps 2-6.
 
 :::{caution}