Installation
============

Requirements
------------

The MULTIPLY Data Access Component has been developed against Python 3.6. It is not guaranteed to work with earlier Python versions, so we suggest using 3.6 or higher. The DAC will attempt to download data from remote sources. We therefore recommend running it on a computer with plenty of storage (solid state disks are recommended) and a good internet connection.

Installing from source
----------------------

To install the Data Access Component, clone the latest version of the MULTIPLY code from GitHub and step into the checked-out directory:
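This amounts to the following two commands (the repository URL below is an assumption; substitute the actual location of the MULTIPLY Data Access repository if it differs):

```shell
# Clone the MULTIPLY Data Access repository (URL assumed) and enter it
git clone https://github.com/multiply-org/data-access.git
cd data-access
```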

To install the MULTIPLY Data Access into an existing Python environment just for the current user, use::

    python setup.py install --user

To install the MULTIPLY Data Access for development and for the current user, use::

    python setup.py develop --user

Configuration
-------------

There are a few configuration options for the DAC. These options become available after you use the DAC for the first time. To use it, enter in a Python console::

    >>> from multiply_data_access import DataAccessComponent
    >>> dac = DataAccessComponent()

When you execute this for the first time, a folder named ``.multiply`` is created in your home directory. In it you will find a file called ``data_stores.yml``, which we will refer to as the data stores file in the following. This file lists the data stores to which the DAC has access. Initially, it consists of several default entries for data stores that are required for accessing remote data (for an explanation of the concepts of a FileSystem and a MetaInfoProvider, see the section on functionality). These entries have settings that look like the following::

    - DataStore:
        FileSystem:
            parameters:
                path: /path/to/user_home/.multiply/aws_s2/
                pattern: /dt/yy/mm/dd/
                temp_dir: /path/to/user_home/.multiply/aws_s2/temp/
            type: AwsS2FileSystem
        Id: aws_s2
        MetaInfoProvider:
            parameters:
                path_to_json_file: /path/to/user_home/.multiply/aws_s2/aws_s2_store.json
            type: AwsS2MetaInfoProvider

Consider especially the parameters ``path`` and ``pattern`` of the FileSystem. These parameters determine where downloaded data will be saved: ``path`` sets the root path, while ``pattern`` sets a pattern for an additional relative path. Here, ``dt`` stands for the data type, ``yy`` for the year, ``mm`` for the month, and ``dd`` for the day of the month. So, if you download S2 L1C data in the AWS format for the 26th of April, 2018, using the above configuration it will be saved to ``/path/to/user_home/.multiply/aws_s2/aws_s2_l1c/2018/4/26/``. Feel free to change these parameters so the data is stored where you want it. If you point ``path`` to a folder that already contains data, make sure that data conforms to the pattern so it will be detected. If you want to add a new data store using data you have already stored locally, see the User Guide.
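The way ``path`` and ``pattern`` combine can be sketched as follows. Note that ``expand_pattern`` is a hypothetical helper written for illustration only; it is not part of the DAC API:

```python
from datetime import date

def expand_pattern(root: str, pattern: str, data_type: str, day: date) -> str:
    """Illustrate how a /dt/yy/mm/dd/ pattern expands into a storage path."""
    relative = (pattern
                .replace('dt', data_type)
                .replace('yy', str(day.year))
                .replace('mm', str(day.month))
                .replace('dd', str(day.day)))
    return root.rstrip('/') + relative

# S2 L1C data in the AWS format for the 26th of April, 2018:
print(expand_pattern('/path/to/user_home/.multiply/aws_s2/', '/dt/yy/mm/dd/',
                     'aws_s2_l1c', date(2018, 4, 26)))
# → /path/to/user_home/.multiply/aws_s2/aws_s2_l1c/2018/4/26/
```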

Some of the data stores require authentication. Here we describe how to set up access to Sentinel-2 data from Amazon Web Services (AWS, https://registry.opendata.aws/sentinel-2/) and to MODIS data from the Land Processes Distributed Active Archive Center (LP DAAC, https://lpdaac.usgs.gov).

Configuring Access to MODIS Data from the LP DAAC
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To access the data, you need an Earthdata Login. If you do not have such a login, register `here <https://urs.earthdata.nasa.gov/home>`_. Registration and data access are free of charge. When you have the login data, open the data stores file and search for the data store with the Id ``MODIS Data``. You will find two entries, ``username`` and ``password``. Enter your Earthdata username and password there. The entry should then look something like this::

    - DataStore:
        FileSystem:
            type: LpDaacFileSystem
            parameters:
                temp_dir: /path/to/user_home/.multiply/modis/
                username: earthdata_login_user_name
                password: earthdata_login_password
                path: /path/to/data/modis/
                pattern: /dt/yy/mm/dd/
        Id: MODIS Data
        MetaInfoProvider:
            type: LpDaacMetaInfoProvider
            parameters:
                path_to_json_file: /path/to/user_home/.multiply/modis/modis_store.json

Then simply save the file.
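If you want to make sure your edits did not break the file's syntax, you can parse it with the PyYAML package (a dependency of the DAC). The snippet below is a minimal sketch that parses an inline example mirroring the structure above; point it at your actual ``data_stores.yml`` to check the real file:

```python
import yaml

# Inline example mirroring the structure of a data_stores.yml entry
example = """
- DataStore:
    FileSystem:
        type: LpDaacFileSystem
        parameters:
            username: earthdata_login_user_name
            password: earthdata_login_password
    Id: MODIS Data
    MetaInfoProvider:
        type: LpDaacMetaInfoProvider
"""

stores = yaml.safe_load(example)
# Each list entry is a DataStore with an Id, a FileSystem, and a MetaInfoProvider
for store in stores:
    print(store['DataStore']['Id'])
```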

Configuring Access to Sentinel-2 Data from Amazon Web Services
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

First, you can enable the DAC to download Sentinel-2 data from Amazon Web Services. Please note that, unlike the other forms of data access, this one eventually costs money. The charge is small, though (see `this announcement <https://forum.sentinel-hub.com/t/changes-of-the-access-rights-to-l1c-bucket-at-aws-public-datasets-requester-pays/172>`_). To enable access, go to https://aws.amazon.com/free/ and sign up for a free account. You can then log on to the `Amazon Console <https://console.aws.amazon.com/console/home>`_. From the menu items Services->Security, Identity and Compliance, choose IAM. There, under Users, you can add a new user. Choose a user name and make sure the check box for Programmatic Access is checked.

.. figure:: _static/figures/aws_add_user.png
   :scale: 50%
   :align: center

On the next page you need to set the permissions for the user. Choose Attach existing policies directly and check the boxes for AmazonEC2FullAccess and AmazonS3FullAccess (later you may simply choose to copy the permissions from an existing user).

.. figure:: _static/figures/aws_add_user_permissions.png
   :scale: 50%
   :align: center

When everything is correct, you can create the user. On the next page you will be shown the access key id and the secret access key. You can also download both as a .csv file.

Next, you will need to install the sentinelhub Python package. Follow the instructions from the `sentinelhub installation page <https://sentinelhub-py.readthedocs.io/en/latest/install.html>`_ to do so. Then proceed to configure sentinelhub using your AWS credentials, following the instructions from the `sentinelhub configuration page <https://sentinelhub-py.readthedocs.io/en/latest/configure.html>`_.
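In recent sentinelhub versions, the configuration step amounts to a command-line call of roughly the following shape; this is a sketch based on the sentinelhub documentation, and the key values are placeholders for the credentials you created above:

```shell
# Store the AWS credentials created above in the sentinelhub configuration
sentinelhub.config --aws_access_key_id "your_access_key_id" \
                   --aws_secret_access_key "your_secret_access_key"

# Verify what is currently configured
sentinelhub.config --show
```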

The MULTIPLY Data Access Component will then be able to access this data.
