IPython Notebook Integration

IPython/Jupyter provides a browser-based interactive shell that supports data visualization. The storlets integration with IPython allows easy deployment and invocation of storlets from an IPython notebook. The sections below describe how to set up an IPython notebook to work with storlets, how to deploy a Python storlet, and how to invoke a storlet.

Set up IPython to work with storlets

Setting up an IPython notebook to work with storlets involves:

  1. Providing the authentication information of a storlet-enabled Swift account. This is done by setting environment variables similar to those used by the Swift client. The exact variables that need to be set depend on the auth middleware used and the auth protocol version. For more details, please refer to the python-swiftclient docs.

  2. Loading the storlets IPython extension.

The following environment variable definitions match the default storlets development environment installation (s2aio).

import os
os.environ['OS_AUTH_VERSION'] = '3'
os.environ['OS_AUTH_URL'] = 'http://127.0.0.1/v3'
os.environ['OS_USERNAME'] = 'tester'
os.environ['OS_PASSWORD'] = 'testing'
os.environ['OS_USER_DOMAIN_NAME'] = 'default'
os.environ['OS_PROJECT_DOMAIN_NAME'] = 'default'
os.environ['OS_PROJECT_NAME'] = 'test'
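
Before loading the extension, it can help to confirm that the variables are actually set. The following is a minimal sketch: the helper name missing_auth_vars and the list of required names are assumptions taken from the s2aio example above, not part of the storlets extension, and other auth setups may need a different set of variables.

```python
import os

# Assumption: the variable names used by the s2aio example above.
# Other auth middlewares or protocol versions may need a different set.
REQUIRED_VARS = (
    'OS_AUTH_VERSION',
    'OS_AUTH_URL',
    'OS_USERNAME',
    'OS_PASSWORD',
    'OS_USER_DOMAIN_NAME',
    'OS_PROJECT_DOMAIN_NAME',
    'OS_PROJECT_NAME',
)


def missing_auth_vars(environ=None):
    """Return the names in REQUIRED_VARS that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_auth_vars()
if missing:
    print('Missing auth variables: ' + ', '.join(missing))
else:
    print('All auth variables are set')
```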

To load the storlets IPython extension, enter and execute the following:

%load_ext storlets.tools.extensions.ipython

Deploy a Python storlet

General background on storlets deployment is found here.

In a new notebook cell, enter the ‘%%storletapp’ directive followed by the storlet name. Following that, type the storlet code. Below is an example of a simple ‘identity’ storlet. Executing the cell will deploy the storlet into Swift.

%%storletapp test.TestStorlet

class TestStorlet(object):
    def __init__(self, logger):
        self.logger = logger

    def __call__(self, in_files, out_files, params):
        """
        The function called for storlet invocation
        :param in_files: a list of StorletInputFile
        :param out_files: a list of StorletOutputFile
        :param params: a dict of request parameters
        """
        self.logger.debug('Returning metadata')
        metadata = in_files[0].get_metadata()
        for key in params.keys():
            metadata[key] = params[key]
        out_files[0].set_metadata(metadata)

        self.logger.debug('Start to return object data')
        content = ''
        while True:
            buf = in_files[0].read(16)
            if not buf:
                break
            content += buf
        self.logger.debug('Received %d bytes' % len(content))
        self.logger.debug('Writing back %d bytes' % len(content))
        out_files[0].write(content)
        self.logger.debug('Complete')
        in_files[0].close()
        out_files[0].close()
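
The storlet logic can also be tried locally before it is deployed. The sketch below is not part of the storlets tooling: FakeInputFile and FakeOutputFile are hypothetical stand-ins that implement only the methods the example calls (get_metadata, read, close, set_metadata, write). They exchange str chunks to match the example's content = '' and the real StorletInputFile may behave differently (for instance, it may return bytes).

```python
import logging


class FakeInputFile(object):
    """Hypothetical stand-in for StorletInputFile (illustration only)."""

    def __init__(self, data, metadata):
        self._data = data
        self._metadata = metadata
        self._pos = 0
        self.closed = False

    def get_metadata(self):
        return dict(self._metadata)

    def read(self, size):
        # Return at most `size` characters; an empty string signals the end.
        chunk = self._data[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk

    def close(self):
        self.closed = True


class FakeOutputFile(object):
    """Hypothetical stand-in for StorletOutputFile (illustration only)."""

    def __init__(self):
        self.metadata = None
        self.chunks = []
        self.closed = False

    def set_metadata(self, metadata):
        self.metadata = metadata

    def write(self, data):
        self.chunks.append(data)

    def close(self):
        self.closed = True


class TestStorlet(object):
    """Condensed copy of the identity storlet shown above."""

    def __init__(self, logger):
        self.logger = logger

    def __call__(self, in_files, out_files, params):
        metadata = in_files[0].get_metadata()
        metadata.update(params)
        out_files[0].set_metadata(metadata)
        content = ''
        while True:
            buf = in_files[0].read(16)
            if not buf:
                break
            content += buf
        out_files[0].write(content)
        in_files[0].close()
        out_files[0].close()


source = FakeInputFile('hello storlets, this is a test object',
                       {'type': 'text'})
sink = FakeOutputFile()
storlet = TestStorlet(logging.getLogger('storlet-test'))
storlet([source], [sink], {'color': 'red'})

print(''.join(sink.chunks))
print(sink.metadata)
```

Running the class against stand-ins like these only checks the storlet's own logic; it says nothing about how the storlet behaves inside the Swift sandbox.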

Note

To run the storlet on an actual data set, enter the following at the top of the cell:

%%storletapp test.TestStorlet --with-invoke --input path:/<container>/<object> --print-result

N.B. Useful options such as ‘dry-run’ are under development. More details on the available options are given later in this document.

Invoke a storlet

General information on storlet invocation can be found here.

Here is how an invocation works:

  1. Define an optional dictionary variable params that holds the invocation parameters:

    myparams = {'color' : 'red'}
    
  2. To invoke test.TestStorlet on a GET, type the following:

    %get --storlet test.py --input path:/<container>/<object>  -i myparams -o myresult
    

The invocation executes test.py over the specified Swift object with parameters read from myparams. The result is placed in myresult. The ‘-i’ argument is optional; if it is specified, the supplied value must be the name of a defined dictionary variable. myresult is an instance of storlets.tools.extensions.ipython.Response. This class has the following members:

  1. status - An integer holding the HTTP response status

  2. headers - A dictionary holding the storlet invocation response headers

  3. iter_content - An iterator over the response body

  4. content - The content of the response body
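
As an illustration of how these members might be used, the sketch below summarizes a response. FakeResponse is a hypothetical stand-in so that the snippet runs without a Swift cluster; in a notebook, the variable named with -o (myresult above) would be passed instead. Whether iter_content is exposed as a method or as a plain iterator is not stated here, so the helper accepts both.

```python
class FakeResponse(object):
    """Hypothetical stand-in exposing the members listed above."""

    def __init__(self, status, headers, chunks):
        self.status = status
        self.headers = headers
        self._chunks = chunks

    def iter_content(self):
        return iter(self._chunks)


def summarize_response(resp):
    """Return (ok, total_length) for a response-like object."""
    ok = 200 <= resp.status < 300
    body = resp.iter_content
    if callable(body):
        # Assumption: iter_content may be a method rather than an attribute.
        body = body()
    total = sum(len(chunk) for chunk in body)
    return ok, total


demo = FakeResponse(200, {'Content-Type': 'text/plain'}, ['abc', 'defg'])
ok, total = summarize_response(demo)
print(ok, total)
```

Iterating over the body in chunks avoids holding a large object in memory, which is the usual reason to prefer the iterator over content.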

  3. To invoke test.TestStorlet on a PUT, type the following:

    %put --storlet test.py --input <full path to local file> --output path:/<container>/<object>  -i myparams -o myresult
    

The invocation executes test.py over the uploaded file specified with the --input option, which must be a full local path. test.py is invoked with parameters read from myparams. The result is placed in myresult. The ‘-i’ argument is optional; if it is specified, the supplied value must be the name of a defined variable. myresult is a dictionary with the following keys:

  1. status - An integer holding the HTTP response status

  2. headers - A dictionary holding the storlet invocation response headers

  4. To invoke test.TestStorlet on a COPY, type the following:

    %copy --storlet test.py --input path:/<container>/<object> --output path:/<container>/<object>  -i myparams -o myresult
    

The invocation executes test.py over the input object specified with the --input option. The execution result is saved in the output object specified with the --output option. test.py is invoked with parameters read from myparams. The result is placed in myresult. The ‘-i’ argument is optional; if it is specified, the supplied value must be the name of a defined variable. myresult is a dictionary with the following keys:

  1. status - An integer holding the HTTP response status

  2. headers - A dictionary holding the storlet invocation response headers
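
The three magics share the same option layout, so their argument lines can be assembled by one small function. This is a sketch only: build_magic_line is a hypothetical helper, not part of the extension, and it uses just the flags shown above. The container and object names are placeholders.

```python
def build_magic_line(storlet, input_path, output_path=None,
                     params_var=None, result_var=None):
    """Assemble the argument line for %get, %put or %copy."""
    parts = ['--storlet', storlet, '--input', input_path]
    if output_path is not None:
        # For %put and %copy the output is always a Swift object path.
        if not output_path.startswith('path:'):
            raise ValueError(
                'output must look like path:/<container>/<object>')
        parts += ['--output', output_path]
    if params_var is not None:
        parts += ['-i', params_var]
    if result_var is not None:
        parts += ['-o', result_var]
    return ' '.join(parts)


# 'mycontainer', 'myobject' and 'mycopy' are placeholder names.
get_line = build_magic_line('test.py', 'path:/mycontainer/myobject',
                            params_var='myparams', result_var='myresult')
print('%get ' + get_line)

copy_line = build_magic_line('test.py', 'path:/mycontainer/myobject',
                             output_path='path:/mycontainer/mycopy',
                             result_var='myresult')
print('%copy ' + copy_line)

# In a notebook, one option would be to pass the line to IPython:
#   get_ipython().run_line_magic('get', get_line)
```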

Extension docs

Implementation of magic funcs for interaction with the OpenStack Storlets.

This extension is designed to use OS environment variables to set the authentication and the storage target host (for now).

class storlets.tools.extensions.ipython.Response(status, headers, body_iter=None)

Bases: object

Response object to return the object to ipython cell

Parameters
  • status – int for status code

  • headers – a dict for response headers

  • body_iter – an iterator object from which the body content is taken

class storlets.tools.extensions.ipython.StorletMagics(**kwargs)

Bases: IPython.core.magic.Magics

Magics to interact with OpenStack Storlets

copy(line)
%copy [--input INPUT] [--output OUTPUT] [--storlet STORLET] [-i I]
          [-o O]
optional arguments:
--input INPUT

The input object for the storlet execution. This option must be of the form “path:<container>/<object>”

--output OUTPUT

The output object for the storlet execution. This option must be of the form “path:<container>/<object>”

--storlet STORLET

The storlet to execute over the input

-i I

A name of a variable defined in the environment holding a dictionary with the storlet invocation input parameters

-o O

A name of an output variable to hold the invocation result. The output variable is a dictionary with the fields status and headers, holding the response status and headers respectively

get(line)
%get [--input INPUT] [--storlet STORLET] [-i I] [-o O]
optional arguments:
--input INPUT

The input object for the storlet execution. This option must be of the form “path:<container>/<object>”

--storlet STORLET

The storlet to execute over the input

-i I

A name of a variable defined in the environment holding a dictionary with the storlet invocation input parameters

-o O

A name of an output variable to hold the invocation result. The output variable is a dictionary with the fields status, headers and content_iter, holding the response status, headers, and body iterator respectively

put(line)
%put [--input INPUT] [--output OUTPUT] [--storlet STORLET] [-i I] [-o O]
optional arguments:
--input INPUT

The local input object for upload. This option must be a full path of a local file

--output OUTPUT

The output object of the storlet execution. This option must be of the form “path:<container>/<object>”

--storlet STORLET

The storlet to execute over the input

-i I

A name of a variable defined in the environment holding a dictionary with the storlet invocation input parameters

-o O

A name of an output variable to hold the invocation result. The output variable is a dictionary with the fields status and headers, holding the response status and headers respectively

storletapp(line, cell)
%storletapp [-c CONTAINER] [-d DEPENDENCIES] [--with-invoke]
                [--input INPUT] [--print-result]
                module_class
positional arguments:

module_class module and class name to upload

optional arguments:
-c CONTAINER, --container CONTAINER

Storlet container name, “storlet” by default

-d DEPENDENCIES, --dependencies DEPENDENCIES

Storlet container name, “storlet” by default

--with-invoke

An option to run the storlet for testing. This requires the --input option

--input INPUT

Specify the input object path, which must be of the form “path:/<container>/<object>”

--print-result

Print the result object to stdout. Note that this may be a large binary, depending on your app
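
The interplay between these options can be illustrated with a small argparse sketch. It is not the extension's own parser: it mirrors only some of the options documented above (the -d option is left out), and it assumes that --with-invoke without --input should be rejected, as the --with-invoke description suggests. The container and object names are placeholders.

```python
import argparse
import shlex


def parse_storletapp_line(line):
    """Parse a %%storletapp argument line (illustrative sketch only)."""
    parser = argparse.ArgumentParser(prog='storletapp')
    parser.add_argument('module_class')
    parser.add_argument('-c', '--container', default='storlet')
    parser.add_argument('--with-invoke', action='store_true')
    parser.add_argument('--input', default=None)
    parser.add_argument('--print-result', action='store_true')
    args = parser.parse_args(shlex.split(line))
    # Assumption: a test invocation cannot run without an input object.
    if args.with_invoke and not args.input:
        raise ValueError('--with-invoke requires --input')
    if args.input and not args.input.startswith('path:'):
        raise ValueError('--input must look like path:/<container>/<object>')
    return args


args = parse_storletapp_line(
    'test.TestStorlet --with-invoke '
    '--input path:/mycontainer/myobject --print-result')
print(args.module_class, args.container, args.input)
```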

uploadfile(line, cell)
%uploadfile container_obj

Upload the contents of the cell to OpenStack Swift.

positional arguments:

container_obj container/object path to upload