User Data in the User's Home Directory

DSP-TOOLS saves user data in the user's home directory, in the folder .dsp-tools. Here is an overview of its structure:

file/folder                  command using it   description
xmluploads                   xmlupload          saves id2iri mappings and error reports
docker                       start-stack        files necessary to start up Docker containers
rosetta                      rosetta            a clone of the rosetta test project
logging.log, logging.log.1   several            these two grow up to 3 MB, then the oldest entries are deleted
fast-xmlupload               several            shell script for local processing
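
The size-limited pair logging.log / logging.log.1 matches the behaviour of Python's standard RotatingFileHandler. The following is only a sketch of how such a setup could look, not necessarily how DSP-TOOLS configures its logging: the function name, the logger name, the log format, and the reading of "3 MB" as a per-file limit are all assumptions.

```python
import logging
import logging.handlers
from pathlib import Path


def make_rotating_logger(log_dir: Path, max_bytes: int = 3_000_000) -> logging.Logger:
    """Return a logger writing to log_dir/logging.log with a single backup file.

    Hypothetical helper: the 3 MB default is assumed to be a per-file limit.
    """
    log_dir.mkdir(parents=True, exist_ok=True)
    handler = logging.handlers.RotatingFileHandler(
        filename=log_dir / "logging.log",
        maxBytes=max_bytes,  # roll over before the current file exceeds this size
        backupCount=1,       # keep only logging.log.1; older entries are discarded
        encoding="utf-8",
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("dsp_tools_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

With backupCount=1, a rollover renames logging.log to logging.log.1 and overwrites the previous backup, so at most two files exist at any time.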

Remark: Docker is normally not able to access files stored in the site-packages of a Python installation. Therefore, it's necessary to copy the "docker" folder to the user's home directory.
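
Copying a packaged folder into the user's home directory could be sketched as follows. This is an illustration under assumptions, not the actual DSP-TOOLS implementation: the helper name copy_packaged_folder and its parameters are hypothetical. Note also that as_file() only supports directories inside zipped packages from Python 3.12 onwards; for a regular installation in site-packages it simply yields the existing path.

```python
import shutil
from importlib.resources import as_file, files
from pathlib import Path


def copy_packaged_folder(package: str, folder: str, target_root: Path) -> Path:
    """Copy `folder` shipped inside `package` to target_root/folder (hypothetical helper)."""
    target = target_root / folder
    source = files(package).joinpath(folder)
    # as_file() provides a real filesystem path for the duration of the block
    with as_file(source) as source_path:
        # dirs_exist_ok=True lets a later call refresh an existing copy
        shutil.copytree(source_path, target, dirs_exist_ok=True)
    return target
```

A call such as copy_packaged_folder("dsp_tools", "docker", Path.home() / ".dsp-tools") would then leave the packaged files untouched and give Docker a location it can access. The package and folder names in this call are taken from the text above and are assumed to match the installed layout.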

How to Ship Data Files to the User

Accessing non-Python files (aka resources, aka data files) in the code needs special attention.

Firstly, the build tool must be told to include these files and folders in the distribution. In our case, this happens in [tool.poetry.include] in the pyproject.toml file.

Secondly, on the user's machine, the files inside site-packages should be treated as read-only. This avoids a series of common problems, e.g. when multiple users share a common Python installation, when the package is loaded from a zip file, or when multiple instances of a Python application run in parallel.

Thirdly, the files can neither be accessed with a relative path from the referencing file, nor with a path relative to the root of the project.

For example, if you have a structure like this:

dsp-tools
├── pyproject.toml
└── src
    └── dsp_tools
        ├── schemas
        │   └── data.xsd
        ├── __init__.py
        └── dsp_tools.py

neither of the following works in dsp_tools/dsp_tools.py:

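# fails: the relative path is resolved against the current working directory
with open('schemas/data.xsd') as data_file:
    ...
# fails: only valid if the working directory happens to be the project root
with open('src/dsp_tools/schemas/data.xsd') as data_file:
    ...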

These two approaches fail because relative paths are resolved against the working directory. On the user's machine, this is the directory from which DSP-TOOLS is called, not the directory where the distribution files are located.
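
The effect can be reproduced in isolation. The sketch below uses two temporary directories as stand-ins (both hypothetical) for the installation directory and for the directory the user works in; it only illustrates how relative paths are resolved and is not DSP-TOOLS code.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as install_dir, tempfile.TemporaryDirectory() as work_dir:
    # pretend the distribution files live in install_dir ...
    schema = Path(install_dir) / "schemas" / "data.xsd"
    schema.parent.mkdir()
    schema.write_text("<xs:schema/>")

    # ... while the user calls the tool from work_dir
    previous_cwd = os.getcwd()
    os.chdir(work_dir)
    try:
        # the relative path is resolved against work_dir, not install_dir
        resolved = Path("schemas/data.xsd").resolve()
        found = resolved.exists()
    finally:
        os.chdir(previous_cwd)  # restore before the temporary directories are removed

print(resolved)  # a path below work_dir
print(found)     # False: the file only exists below install_dir
```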

To circumvent this problem, it was once common to manipulate a package’s __file__ attribute in order to find the location of data files:

import os
# build the path relative to the location of the current module
data_path = os.path.join(os.path.dirname(__file__), 'schemas', 'data.xsd')
with open(data_path) as data_file:
    ...

However, this manipulation isn’t compatible with PEP 302-based import hooks, including importing from zip files and Python Eggs.
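
The zip-file case can be tried out with a throwaway archive. In the following sketch, the package name zipped_demo and the archive name bundle.zip are made up for the demonstration. It builds a zipped package, imports it, and compares the __file__-based lookup with a lookup through importlib.resources.

```python
import importlib
import os
import shutil
import sys
import tempfile
import zipfile
from importlib.resources import files

tmp = tempfile.mkdtemp()
archive = os.path.join(tmp, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipped_demo/__init__.py", "")  # hypothetical demo package
    zf.writestr("zipped_demo/schemas/data.xsd", "<xs:schema/>")

sys.path.insert(0, archive)  # make the archive importable
importlib.invalidate_caches()
try:
    package = importlib.import_module("zipped_demo")
    # __file__ points *into* the archive, so this is not a real filesystem path
    file_based = os.path.join(os.path.dirname(package.__file__), "schemas", "data.xsd")
    file_based_works = os.path.exists(file_based)
    # importlib.resources reads the resource through the import system instead
    text = files("zipped_demo").joinpath("schemas/data.xsd").read_text(encoding="utf-8")
finally:
    sys.path.remove(archive)
    shutil.rmtree(tmp, ignore_errors=True)  # best-effort cleanup of the demo archive

print(file_based_works)  # False
print(text)              # <xs:schema/>
```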

The canonical way is to use importlib.resources:

from importlib.resources import files
# address "schemas" directory in module syntax: needs __init__.py
data_text = files('dsp_tools.schemas').joinpath('data.xsd').read_text()
# avoid module syntax when addressing "schemas" directory: no __init__.py necessary
data_text = files('dsp_tools').joinpath('schemas/data.xsd').read_text()

Note that the "schemas" directory needs an __init__.py file only if it is addressed in module syntax; if it is addressed as a path below the package, the file can be omitted.

The information on this page is mainly based upon:


Last update: September 12, 2023