Organizing data
It is strongly advised to set up your input/output data file hierarchy relative to a root folder, so that when you deploy your application, experts only need to specify a different root data path for everything to keep working. For instance:
import os
from onecode import file_input, Project, Mode
project = Project()
project.mode = Mode.EXECUTE
file = os.path.join('region', 'model.h5')
x = file_input('x', file)
# region/model.h5 is relative to your Project data path.
print(f'Are path equals? {x == os.path.join(project.data_root, file)}')
Are path equals? True
From there, any expert can keep a data folder with the same hierarchy anywhere on their disk and launch the script by simply changing the data path at runtime:
ONECODE_PROJECT_DATA=/path/to/my/data python main.py
On the other hand, as soon as you specify an absolute path for your input files, the Project data path is ignored. There may be special cases where this is what you want, but most of the time you should use relative paths.
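The difference between relative and absolute paths can be illustrated with the standard library alone. The sketch below does not call OneCode: `resolve_input` is a hypothetical helper and the paths are made up, and it only assumes that the resolution behaves like `os.path.join`, which discards the preceding components as soon as it meets an absolute one.

```python
import os


def resolve_input(data_root: str, path: str) -> str:
    """Hypothetical helper: resolve an input path against a data root."""
    # os.path.join drops data_root when `path` is absolute,
    # so an absolute path is returned unchanged.
    return os.path.join(data_root, path)


# Made-up paths, for illustration only.
data_root = os.path.join(os.sep, 'path', 'to', 'my', 'data')
relative = os.path.join('region', 'model.h5')
absolute = os.path.join(os.sep, 'shared', 'model.h5')

print(resolve_input(data_root, relative))  # prefixed with data_root
print(resolve_input(data_root, absolute))  # data_root is ignored
```

With a relative path the result moves with the data root; with an absolute path it stays fixed wherever the data root points.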
Note
Input elements such as csv_reader and file_input make the path relative to Project().data_root.
Output elements such as csv_output, file_output, image_output and text_output make the path relative to {Project().data_root}/{Project.current_flow}/outputs.
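As a rough illustration of the output layout described in the note, the sketch below builds such a path by hand. It is not OneCode code: `output_path`, the flow name `my_flow` and the file name `result.csv` are hypothetical and only mirror the `{data_root}/{current_flow}/outputs` pattern.

```python
import os


def output_path(data_root: str, current_flow: str, filename: str) -> str:
    """Hypothetical helper mirroring the {data_root}/{current_flow}/outputs layout."""
    return os.path.join(data_root, current_flow, 'outputs', filename)


# Made-up values, for illustration only.
example = output_path('data', 'my_flow', 'result.csv')
print(example)
```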
Tip
How is the Project data path determined? The data path is initialized according to the following rules ordered by priority:
- to ONECODE_PROJECT_DATA if provided in the environment variables
- to the data folder located in the directory from which the project is run, if it exists (typically the OneCode project data folder)
- to the current working directory in all other cases
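The priority rules above can be sketched in plain Python. This is not OneCode's actual implementation: `resolve_data_root` and its `run_dir` and `env` parameters are hypothetical names introduced here, and only the `ONECODE_PROJECT_DATA` variable and the `data` folder name come from the rules themselves.

```python
import os
import tempfile
from typing import Mapping, Optional


def resolve_data_root(run_dir: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical sketch of the three priority rules."""
    env = os.environ if env is None else env

    # 1. The environment variable wins when it is provided.
    from_env = env.get('ONECODE_PROJECT_DATA')
    if from_env:
        return from_env

    # 2. Otherwise use the `data` folder in the run directory, if it exists.
    candidate = os.path.join(run_dir, 'data')
    if os.path.isdir(candidate):
        return candidate

    # 3. Fall back to the current working directory.
    return os.getcwd()


# Exercise each rule in a throwaway directory.
with tempfile.TemporaryDirectory() as run_dir:
    fallback = resolve_data_root(run_dir, env={})   # no variable, no data folder
    os.mkdir(os.path.join(run_dir, 'data'))
    expected_local = os.path.join(run_dir, 'data')
    local = resolve_data_root(run_dir, env={})      # data folder now exists
    override = resolve_data_root(
        run_dir, env={'ONECODE_PROJECT_DATA': '/path/to/my/data'}
    )                                               # variable takes priority
```

Passing the environment as a parameter keeps the sketch easy to test without modifying the real process environment.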