PYBIS 1.37.2 - Problems with Spreadsheet decoding

ervolk · 21 February 2025 11:27

Dear PyBIS-developer,

I stumbled across some changes in the pybis module since one of the last updates:

I really appreciate that you are striving to simplify the handling of spreadsheets by integrating decoding and JSON handling and outputting a dict directly.

However, many objects can no longer be loaded via

# for example
sample = o.get_object("20250221115502283-49917")

as soon as there is content in the spreadsheet and that content contains special characters that cannot be decoded as UTF-8.

    101 def to_spreadsheet(self, rawValue):
    102     b64 = rawValue[len("<DATA>"):-len("</DATA>")]
--> 103     json_str = base64.b64decode(b64).decode('utf-8')
    104     result = json.loads(json_str)
    105     return Spreadsheet.from_dict(result)

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdf in position 2376: invalid continuation byte
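
For anyone who wants to reproduce this outside of openBIS, here is a minimal sketch. It assumes the stored payload was encoded as Latin-1 before being base64-encoded, which is my guess based on the byte 0xdf (ß in Latin-1). The cell content and the dict layout are made up for illustration and are not the real spreadsheet format.

```python
import base64
import json

# Hypothetical cell content; the real openBIS spreadsheet JSON looks different.
payload = json.dumps({"A1": "Straße 20 °C"}, ensure_ascii=False)

# Assumption: the client encoded the JSON as Latin-1 instead of UTF-8.
encoded = base64.b64encode(payload.encode("latin-1")).decode("ascii")
raw_value = "<DATA>" + encoded + "</DATA>"

# Same slicing as in the traceback above.
b64 = raw_value[len("<DATA>"):-len("</DATA>")]
raw_bytes = base64.b64decode(b64)

try:
    raw_bytes.decode("utf-8")
    failed = False
except UnicodeDecodeError as err:
    # 0xdf (ß in Latin-1) looks like the start of a two-byte UTF-8 sequence,
    # but the next byte is plain ASCII, so the decoder gives up.
    failed = True
    print(err)
```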

If German umlauts, ß or ° occur, the decoding fails.
In the previous versions I was able to adjust the character set:

json_str = base64.b64decode(coded_string).decode("latin-1")
data_dict = json.loads(json_str)
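
Wrapped into a helper, that workaround could look roughly like the sketch below. `decode_spreadsheet_payload` is a hypothetical name and not part of pybis; it tries UTF-8 first and only falls back to Latin-1 if that fails, and it returns the plain dict rather than a `Spreadsheet` object.

```python
import base64
import json


def decode_spreadsheet_payload(raw_value, fallback_encoding="latin-1"):
    """Hypothetical helper (not a pybis function): decode a <DATA>...</DATA> value."""
    b64 = raw_value[len("<DATA>"):-len("</DATA>")]
    raw_bytes = base64.b64decode(b64)
    try:
        json_str = raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this decode cannot fail.
        json_str = raw_bytes.decode(fallback_encoding)
    return json.loads(json_str)


# Usage with made-up content, once encoded as Latin-1 and once as UTF-8.
sample = {"A1": "Größe in °C"}
text = json.dumps(sample, ensure_ascii=False)
latin1_value = "<DATA>" + base64.b64encode(text.encode("latin-1")).decode("ascii") + "</DATA>"
utf8_value = "<DATA>" + base64.b64encode(text.encode("utf-8")).decode("ascii") + "</DATA>"

latin1_result = decode_spreadsheet_payload(latin1_value)
utf8_result = decode_spreadsheet_payload(utf8_value)
```

The fallback is only a guess at the original encoding: if the bytes were written with some other code page, the result will load but may contain wrong characters.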

But now I can’t even load the object in its entirety.
And that affects a lot of existing objects. At the same time, I can’t convince users to design spreadsheets without using ° or umlauts (ä,ö,ü)…

I would be happy if you could provide a solution for this in future updates.

Cheers,
Volker

alaskowski · 25 February 2025 09:52

Dear Volker,

Could you send me some example data that breaks on your side? UTF-8 handles German characters without any issues - I tested spreadsheet decoding in pybis beforehand but I may have missed something.

Also, which version of openBIS are you using?

Best,
Adam Laskowski

ervolk · 27 February 2025 09:16

Hi Adam,

There is no need for example data. I created a blank sample of type “Experimental Step”, which I can load via PyBIS. The spreadsheet value is empty at that point.
When I put characters into A1, it still loads fine until one of them is ‘ä’, ‘ö’…

But, as you asked, we are still running openBIS 20.10.8.

Greetings,
Volker

ervolk · 28 February 2025 21:35

Hi Adam,

I tried it with openBIS 20.10.11 and it is still the same…

Regards,
Volker

alaskowski · 19 March 2025 07:35

Hello Volker,

I was finally able to reproduce your issue - it seems to be caused by the browser's character encoding. We have implemented a fix in the ELN that will resolve it, but until that is released, we have prepared an updated version of pybis that works around the problem. You can download it here:

Best,
Adam

ervolk · 19 March 2025 08:27

Hi Adam,
Thank you very much for your continued digging and for a viable solution. I had downgraded in the meantime, but your update of PyBIS works as well!

Thank you so much!
Best,
Volker
