Automatic extraction of meta-data from file upload via WebGUI?

Dear all,
I’m trying to understand the data ingestion/upload process, and how to fill in meta-data automatically as far as possible.
As a main use-case we would have images from electron microscopes. These are typically saved as .tiff and have a number of relevant metadata fields encoded in the image and/or in a separate file that contains details such as the acceleration voltage, field-of-view, etc.

Looking through the documentation, the main tool to set up would be the openBIS Importer Toolset (oBIT), together with the Microscopy plugin which then uses the DropBox feature.
From the documentation, this will be a “big” setup step (for us, at least with the IT resources at hand), although ultimately the aim…
Another way would be to use the dropbox feature, which also allows running a script similar to data-set-handler-alignment.py, and to use dataSet.setPropertyValue("Voltage [kV]", 20) or something similar to populate the relevant meta-data for the image.
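To make the extraction part concrete, here is a minimal sketch of what I have in mind, kept free of any openBIS calls so that the same function could in principle be shared between a dropbox script and other upload routes. The sidecar format ("Key = value unit" lines), the key names, and the property codes VOLTAGE_KV and FIELD_OF_VIEW_UM are all assumptions for illustration, not something our microscopes or openBIS define. The syntax is kept simple so it should run under both Jython 2.7 and Python 3.

```python
import re

# Hypothetical mapping from sidecar keys to openBIS property codes.
# Neither the keys nor the codes come from a real instrument or instance.
KEY_TO_PROPERTY = {
    "AccelerationVoltage": "VOLTAGE_KV",
    "FieldOfView": "FIELD_OF_VIEW_UM",
}


def parse_sidecar(text):
    """Return {property_code: float} for known keys found in the sidecar text.

    Assumes one 'Key = value unit' entry per line; unknown keys and
    non-numeric values are skipped rather than raising an error.
    """
    props = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\w+)\s*=\s*([-+]?\d+(?:\.\d+)?)", line)
        if match and match.group(1) in KEY_TO_PROPERTY:
            props[KEY_TO_PROPERTY[match.group(1)]] = float(match.group(2))
    return props
```

In a dropbox script, the resulting dictionary could then be looped over and each entry passed to dataSet.setPropertyValue(code, value).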

I was wondering whether it is possible to run such a script when uploading a file manually from the WebGUI (e.g. ExperimentalStep → Upload) as well. There are two main reasons. First, we would like to start small and practice getting the data in without having to set up mounts and network drives across systems just to put files into a dropbox. Second, as we work in different research groups at different institutions, it may not be possible to have a dropbox in all cases, and we would then need to rely on the web interface.

Many thanks
Ulrich

Hello Ulrich,
We have almost the same situation of having many users request to extract metadata directly from their uploads and possibly even create OpenBIS entities automatically.

OpenBIS offers a couple of “native” solutions for doing this:

  • on the server side, you can use dropbox plugins written in Jython or Java. This requires that you have access to the OpenBIS instance server to deploy your code. The advantage is that you only need to implement a well-defined interface, while all other concerns (logging, transaction management, …) are taken care of by OpenBIS

  • on the client side, you could work with pybis or the V3 API. This decouples you from the server and may allow for faster development, but you have to deal with several concerns yourself (the same ones I mentioned above).
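
For the client-side route, a minimal pybis sketch could look like the following. It assumes pybis is installed and that the dataset type RAW_DATA, the sample identifier, and the property code voltage_kv exist in your instance; all three are placeholders, as are the function names. The pybis import is done inside the upload function so that the argument-building helper can be tried out without a server.

```python
def build_dataset_kwargs(file_path, sample_identifier, props, dataset_type="RAW_DATA"):
    """Collect the keyword arguments for pybis' new_dataset call.

    dataset_type and the property codes in props are placeholders that
    must match what is defined in the target openBIS instance.
    """
    return {
        "type": dataset_type,
        "sample": sample_identifier,
        "files": [file_path],
        "props": dict(props),
    }


def upload_with_metadata(url, user, password, file_path, sample_identifier, props):
    """Log in, register one file as a dataset with properties, and log out."""
    from pybis import Openbis  # lazy import: only needed for the actual upload

    o = Openbis(url)
    o.login(user, password)
    try:
        dataset = o.new_dataset(
            **build_dataset_kwargs(file_path, sample_identifier, props)
        )
        dataset.save()
        return dataset
    finally:
        o.logout()
```

A call would then be something like upload_with_metadata(url, user, password, "image.tiff", "/SPACE/PROJECT/STEP1", {"voltage_kv": 20}), with the identifier adapted to your own space and project. Error handling, logging, and retries are exactly the concerns you would still have to add yourself.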

We are currently prototyping alternative architectures that tend towards the second approach but “done well”, that is, where the user only needs to implement a few methods to obtain metadata during uploads. I’d be happy to keep in touch regarding this.

Best

Simone

Dear Simone,

many thanks for your post.
We will definitely work towards some dropboxes for various experimental machines, though I guess that will come with its own challenges regarding different network areas and/or across different groups with their own infrastructure…
As a side-note, is there already some experience with FUSE/Cyberduck for S3 as an intermediate staging area? At least for our setup that might be a way around a lot of issues.

I would also assume that a group of users will be happy to use a pyBis script, though I suspect there will be a group of users who, um, would be a bit confused as there is nowhere to click… I suppose we could wrap this around Django/Heroku but it feels like the development time is better spent elsewhere…
It would be ideal if we could choose to run one of the same Jython scripts in the web-uploader that are also used in the dropbox setup (or at least, the part extracting the metadata).

It would be great if we could stay in touch about this, my (obfuscated) email is: ulrich -DOT- kerzel --AT-- rwth-aachen /DOT/ de

Many thanks
Ulrich

Dear Ulrich,

I would also assume that a group of users will be happy to use a pyBis script, though I suspect there will be a group of users who, um, would be a bit confused as there is nowhere to click… I suppose we could wrap this around Django/Heroku but it feels like the development time is better spent elsewhere…

This is one possible architecture we are thinking about, except that I wrote the frontend with Vue.js and the backend with FastAPI instead of Django. I am currently not working on the codebase but I’d be happy to share it if there is any interest.

It would be ideal if we could choose to run one of the same Jython scripts in the web-uploader that are also used in the dropbox setup (or at least, the part extracting the metadata).

I think this is already possible using the custom import feature. I have not tested this function yet, as we are probably not going to use dropbox plugins at all.


Hi Simone,

I would be interested in checking out the API + Vue frontend that you implemented. Are you using the new V3 API?
All the best,
Thiago