pyBIS and apiV3: how to add and use a spreadsheet

Hello,
I am currently exploring the pyBIS api, and would really like to use the spreadsheet property. But I couldn’t find how to upload some data, or decode an existing one (that looks like eyJoZWFkZXJzIjpbIkEiLCJCIiwiQyIsIkQiLCJF…).
I tried to add some data on my own (like pandas csv to xml), but that only break the entity (it does not shows correctly in the ELN UI), without returning any error.
I also couldn’t find any information on how to do that using the API V3 (that I used before).
Thank you,
Alain

1 Like

Hi Alain.
Thank you for the question.
One of our colleagues working with the pyBIS will get back to you with the answers.
Just as reminded.
When having some specific, openBIS-related questions, you can drop an email to the openBIS dedicated help desk: openbis-support@id.ethz.ch

Regards.
Ross

Hi Alain,

you can try the following:

import json
import base64
from pandas import DataFrame

# [6:-7] is just a quick and dirty way to get what's inside <DATA> and </DATA>
spreadsheet = json.loads(base64.b64decode(sample.props["general_protocol.spreadsheet"][6:-7]))
df = DataFrame(spreadsheet["data"], columns=spreadsheet["headers"])

To store the (modified) data back to openBIS, you need to export the DataFrame back into the same format:

spreadsheet["data"] = df.to_numpy().tolist()
sample.props["general_protocol.spreadsheet"] = "<DATA>" + str(base64.b64encode(bytes(json.dumps(spreadsheet), encoding='utf-8')), encoding='utf-8') + "</DATA>"

Hope that helps!
Cheers, Swen

1 Like

Hi Swen,
Thank you for the answer, it really helps. I could make the reading but not yet the writing yet (it upload something but the resulting sample in OpenBis does not show). But that could well be something wrong I am doing (reading from a tsv using pandas as well). I will work on it more later.
Thanks again,
Alain

Hey @abecam ,

did you save your sample after changing it?

sample.props["general_protocol.spreadsheet"] = "<DATA>" + str(base64.b64encode(bytes(json.dumps(spreadsheet), encoding='utf-8')), encoding='utf-8') + "</DATA>"
sample.save()

Julia

Thank you @Swen for posting this!

I´m also using this as I want to show some results from another script in a nice table.
However, I can only change an already edited spreadsheet, not an unused one (which I need as it shouldn´t be available for editing by users).

Any ideas how to do this?

Hi @JuliaD,
fetching the spreadsheet is done in three steps:

  • get the general_protocol.spreadsheet property and strip away <DATA> and </DATA>:
  • sample.props["general_protocol.spreadsheet"][6:-7]
  • base64 decode the remaining part
  • transform the json string into Python data structure with json.loads

Creating a spreadsheet is basically just the other way round.

  • create a Python dictionary (eg. spreadsheet) with the keys headers and data (I guess the headers 'style', 'meta', 'width'are not needed, but I am not sure)
  • transform the Python dictionary into a JSON string with json.dumps
  • transform the JSON string into bytes (JSON is always utf-8)
  • base64 encode the bytes
  • add <DATA> and </DATA> at the beginning and end
  • store everything into the general_protocol.spreadsheet property
  • save the sample

I guess the trickiest part is to get the spreadsheet data structure right. Create a sample spreadsheet in the ELN UI, decode it as described above and then try to create your own, using the exact same structure.

Cheers and good luck!

Swen

1 Like

Thanks to your explanations I was able to do this :slight_smile:

I tried it out and it works with ‘headers’ and ‘data’ only. ‘style’, ‘meta’ and ‘width’ are not needed.

Note:
The size of the given spreadsheet doesn´t matter. The openBIS spreadsheet in the User Client will always show columns A to AD and rows 1 to 30, even if your given structure contains only 5 columns and 2 rows.
E.g. mine was:

spreadsheet = {'headers': ['Name', 'ID', 'SizeX', 'SizeY', 'SizeZ'],
               'data': [['99.tif', '997777', '9', '999', '9999'],
                        ['55.tif', '5', '555', '5', '55']]}

Hello Everyone,
here’s an example for how we used the spreadsheet for dynamic properties:

import base64
import re
import collections
import json


def json_load_byteified(file_handle):
    return _byteify(
        json.load(file_handle, object_hook=_byteify),
        ignore_dicts=True
    )

def json_loads_byteified(json_text):
    return _byteify(
        json.loads(json_text, object_hook=_byteify),
        ignore_dicts=True
    )

def _byteify(data, ignore_dicts = False):
    if isinstance(data, str):
        return data

    # If this is a list of values, return list of byteified values
    if isinstance(data, list):
        return [ _byteify(item, ignore_dicts=True) for item in data ]
    # If this is a dictionary, return dictionary of byteified keys and values
    # but only if we haven't already byteified it
    if isinstance(data, dict) and not ignore_dicts:
        return {
            _byteify(key, ignore_dicts=True): _byteify(value, ignore_dicts=True)
            for key, value in data.items() # changed to .items() for Python 2.7/3
        }

    # Python 3 compatible duck-typing
    # If this is a Unicode string, return its string representation
    if str(type(data)) == "<type 'unicode'>":
        return str(data.encode('utf-8'))

    # If it's anything else, return it in its original form
    return data

class Spreadsheet:
    def __init__(self, data):
        self.data = to_dict(data)

    def get_pos(self, key):
        if len(key) != 2:
            raise ValueError("Key should be at most two elements long")
        else:
            x = self.data['headers'].index(key[0]) 
            y = int(key[1]) - 1
        return y, x


    def __getitem__(self, key):
        x, y = self.get_pos(key)
        return self.data['data'][x][y]

    def __setitem__(self, key, value):
        x, y = self.get_pos(key)
        self.data['data'][x][y] = value

def extract(xml):
    """
    Extract only the "DATA" Portion 
    """
    pattern = "<DATA>(.*)<\/DATA>"
    res = re.search(pattern, xml)
    value = res.groups(0)
    return value[0]

def decode_b64(uu_data):
    """
    Return the spreadsheet as JSON string
    """
    return base64.b64decode(uu_data)

def to_dict(prop):
    data = json_loads_byteified(decode_b64(extract(prop)))
    return data

def from_dict(data):
    encoded = base64.b64encode((json.dumps(data)))
    return "<DATA>{0}</DATA>".format(encoded)

def calculate():
    prop = entity.propertyValue('DATA')
    sp = Spreadsheet(prop)
    sp["B3"]= sp["B2"] * 2 + sp["A2"]
    return from_dict(sp.data)

1 Like

Hi @bafe,

could you also share the configuration of the assigned property type for that script?

I have two property types (xml converted with the spreadsheet functionality).
I tried to retrieve the content of the first spreadsheet and simply paste this to the other via a dynamic property plugin.
Sadly it fails. The result (evaluate button of the dynamic property plugin) looks correct but I don´t see anything in the user interface.

Have a nice evening,
Julia

1 Like

Hello Julia, here is the description we wrote in our documentation:

This example is written as a jython dynamic property plugin and can be tested on an openBIS object with a DATA property of type XML. As you can see in the calculate function, you can now reference spreadsheet cells like python dictionaries and use these for computations.

1 Like

Thank you so much!
I didn´t notice I could return a dict but tried to encode it like written above. It works now :slight_smile:

EDIT:
Somehow I saw it once but cannot see any content of this Object anymore…
The log files state “Error: Provided value isn´t a well formed XML document. Content is not allowed in prolog.”.

I tried it with this code now and it´s finally working :partying_face:

Thank you all very very much for your help!

1 Like