pyBIS and apiV3: how to add and use a spreadsheet

abecam · 2 January 2023 14:55

Hello,
I am currently exploring the pyBIS api, and would really like to use the spreadsheet property. But I couldn’t find how to upload some data, or decode an existing one (that looks like eyJoZWFkZXJzIjpbIkEiLCJCIiwiQyIsIkQiLCJF…).
I tried to add some data on my own (like pandas csv to xml), but that only break the entity (it does not shows correctly in the ELN UI), without returning any error.
I also couldn’t find any information on how to do that using the API V3 (that I used before).
Thank you,
Alain

rkuzyakiv · 5 January 2023 15:32

Hi Alain.
Thank you for the question.
One of our colleagues working with the pyBIS will get back to you with the answers.
Just as reminded.
When having some specific, openBIS-related questions, you can drop an email to the openBIS dedicated help desk: openbis-support@id.ethz.ch

Regards.
Ross

Swen · 9 January 2023 14:54

Hi Alain,

you can try the following:

import json
import base64
from pandas import DataFrame

# [6:-7] is just a quick and dirty way to get what's inside <DATA> and </DATA>
spreadsheet = json.loads(base64.b64decode(sample.props["general_protocol.spreadsheet"][6:-7]))
df = DataFrame(spreadsheet["data"], columns=spreadsheet["headers"])

To store the (modified) data back to openBIS, you need to export the DataFrame back into the same format:

spreadsheet["data"] = df.to_numpy().tolist()
sample.props["general_protocol.spreadsheet"] = "<DATA>" + str(base64.b64encode(bytes(json.dumps(spreadsheet), encoding='utf-8')), encoding='utf-8') + "</DATA>"

Hope that helps!
Cheers, Swen

abecam · 10 January 2023 16:20

Hi Swen,
Thank you for the answer, it really helps. I could make the reading but not yet the writing yet (it upload something but the resulting sample in OpenBis does not show). But that could well be something wrong I am doing (reading from a tsv using pandas as well). I will work on it more later.
Thanks again,
Alain

JuliaD · 19 July 2023 14:04

Hey @abecam ,

did you save your sample after changing it?

sample.props["general_protocol.spreadsheet"] = "<DATA>" + str(base64.b64encode(bytes(json.dumps(spreadsheet), encoding='utf-8')), encoding='utf-8') + "</DATA>"
sample.save()

Julia

JuliaD · 19 July 2023 14:11

Thank you @Swen for posting this!

I´m also using this as I want to show some results from another script in a nice table.
However, I can only change an already edited spreadsheet, not an unused one (which I need as it shouldn´t be available for editing by users).

Any ideas how to do this?

Swen · 19 July 2023 22:23

Hi @JuliaD,
fetching the spreadsheet is done in three steps:

get the general_protocol.spreadsheet property and strip away <DATA> and </DATA>:
sample.props["general_protocol.spreadsheet"][6:-7]
base64 decode the remaining part
transform the json string into Python data structure with json.loads

Creating a spreadsheet is basically just the other way round.

create a Python dictionary (eg. spreadsheet) with the keys headers and data (I guess the headers 'style', 'meta', 'width'are not needed, but I am not sure)
transform the Python dictionary into a JSON string with json.dumps
transform the JSON string into bytes (JSON is always utf-8)
base64 encode the bytes
add <DATA> and </DATA> at the beginning and end
store everything into the general_protocol.spreadsheet property
save the sample

I guess the trickiest part is to get the spreadsheet data structure right. Create a sample spreadsheet in the ELN UI, decode it as described above and then try to create your own, using the exact same structure.

Cheers and good luck!

Swen

JuliaD · 20 July 2023 11:29

Thanks to your explanations I was able to do this

I tried it out and it works with ‘headers’ and ‘data’ only. ‘style’, ‘meta’ and ‘width’ are not needed.

Note:
The size of the given spreadsheet doesn´t matter. The openBIS spreadsheet in the User Client will always show columns A to AD and rows 1 to 30, even if your given structure contains only 5 columns and 2 rows.
E.g. mine was:

spreadsheet = {'headers': ['Name', 'ID', 'SizeX', 'SizeY', 'SizeZ'],
               'data': [['99.tif', '997777', '9', '999', '9999'],
                        ['55.tif', '5', '555', '5', '55']]}

bafe · 20 July 2023 13:31

Hello Everyone,
here’s an example for how we used the spreadsheet for dynamic properties:

import base64
import re
import collections
import json


def json_load_byteified(file_handle):
    return _byteify(
        json.load(file_handle, object_hook=_byteify),
        ignore_dicts=True
    )

def json_loads_byteified(json_text):
    return _byteify(
        json.loads(json_text, object_hook=_byteify),
        ignore_dicts=True
    )

def _byteify(data, ignore_dicts = False):
    if isinstance(data, str):
        return data

    # If this is a list of values, return list of byteified values
    if isinstance(data, list):
        return [ _byteify(item, ignore_dicts=True) for item in data ]
    # If this is a dictionary, return dictionary of byteified keys and values
    # but only if we haven't already byteified it
    if isinstance(data, dict) and not ignore_dicts:
        return {
            _byteify(key, ignore_dicts=True): _byteify(value, ignore_dicts=True)
            for key, value in data.items() # changed to .items() for Python 2.7/3
        }

    # Python 3 compatible duck-typing
    # If this is a Unicode string, return its string representation
    if str(type(data)) == "<type 'unicode'>":
        return str(data.encode('utf-8'))

    # If it's anything else, return it in its original form
    return data

class Spreadsheet:
    def __init__(self, data):
        self.data = to_dict(data)

    def get_pos(self, key):
        if len(key) != 2:
            raise ValueError("Key should be at most two elements long")
        else:
            x = self.data['headers'].index(key[0]) 
            y = int(key[1]) - 1
        return y, x


    def __getitem__(self, key):
        x, y = self.get_pos(key)
        return self.data['data'][x][y]

    def __setitem__(self, key, value):
        x, y = self.get_pos(key)
        self.data['data'][x][y] = value

def extract(xml):
    """
    Extract only the "DATA" Portion 
    """
    pattern = "<DATA>(.*)<\/DATA>"
    res = re.search(pattern, xml)
    value = res.groups(0)
    return value[0]

def decode_b64(uu_data):
    """
    Return the spreadsheet as JSON string
    """
    return base64.b64decode(uu_data)

def to_dict(prop):
    data = json_loads_byteified(decode_b64(extract(prop)))
    return data

def from_dict(data):
    encoded = base64.b64encode((json.dumps(data)))
    return "<DATA>{0}</DATA>".format(encoded)

def calculate():
    prop = entity.propertyValue('DATA')
    sp = Spreadsheet(prop)
    sp["B3"]= sp["B2"] * 2 + sp["A2"]
    return from_dict(sp.data)

JuliaD · 20 July 2023 16:54

Hi @bafe,

could you also share the configuration of the assigned property type for that script?

I have two property types (xml converted with the spreadsheet functionality).
I tried to retrieve the content of the first spreadsheet and simply paste this to the other via a dynamic property plugin.
Sadly it fails. The result (evaluate button of the dynamic property plugin) looks correct but I don´t see anything in the user interface.

Have a nice evening,
Julia

bafe · 21 July 2023 06:30

Hello Julia, here is the description we wrote in our documentation:

This example is written as a jython dynamic property plugin and can be tested on an openBIS object with a DATA property of type XML. As you can see in the calculate function, you can now reference spreadsheet cells like python dictionaries and use these for computations.

JuliaD · 21 July 2023 08:06

Thank you so much!
I didn´t notice I could return a dict but tried to encode it like written above. It works now

EDIT:
Somehow I saw it once but cannot see any content of this Object anymore…
The log files state “Error: Provided value isn´t a well formed XML document. Content is not allowed in prolog.”.

JuliaD · 21 July 2023 09:57

I tried it with this code now and it´s finally working

Thank you all very very much for your help!

Topic		Replies	Views
PYBIS 1.37.2 - Problems with Spreadsheet decoding Developments	6	96	25 March 2025
Interact with openBIS via python from scripts Data Management	25	988	5 February 2025
pyBIS v1.35.11 datasets Developments	1	173	24 July 2023
openBIS 20.10.9 release Announcements	0	40	31 July 2024
Create Masterdata using pyBIS Installation & maintenance masterdata	1	251	3 October 2023

pyBIS and apiV3: how to add and use a spreadsheet

Related topics