Project
Single Project object that centralizes OneCode project data, such as the data path, parameter values, registered elements, the currently running flow, and the current running mode.
See reset() for the Project's default initialization.
Attributes:
Name | Type | Description |
---|---|---|
registered_elements | Set[str] | Elements registered for processing. |
mode | Union[Mode, str] | Controls how elements are processed. |
current_flow | Optional[str] | ID of the flow currently running. |
data_root | str | Path to the data folder. |
data | Optional[Dict[str, Any]] | Dictionary containing the data values from interpreted elements. |
config | Optional[Dict[str, Any]] | Dictionary containing the project configuration. |
config: Optional[Dict[str, Any]] property
Get the Project's current configuration options.
Config is simply a key-value dictionary.
current_flow: Optional[str] property writable
Get the currently running flow. If no flow is running, None is returned. It is set automatically when the OneCode project is run through the main entry point (i.e. python main.py or onecode-start).
data: Optional[Dict[str, Any]] property writable
Get the Project's current data. Data is typically set either at the start when running in mode LOAD_THEN_EXECUTE, or incrementally after each call to any input element.
Data is simply a key-value dictionary.
data_root: str property
Get the path to the root of the data folder. See reset() to know how the data path is initialized.
mode: Union[Mode, str] property writable
Get the currently set mode for the OneCode Project. A string is returned in case of custom modes. See Mode for more information.
registered_elements: Set[str] property
Get the set of registered elements (InputElement and OutputElement).
Once a library is registered, the elements that need to be processed must be registered as well.
By default, it returns all Input/Output Elements of the onecode library.
_set_data_root(data_path)
Protected method to set the data root path. It is unsafe to use this method and change the data path while running the OneCode project.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_path | str | Path to the data root. | required |
Raises:
Type | Description |
---|---|
NotADirectoryError | if the data path does not exist or is not a directory. |
add_data(key, value)
Add a key-value pair to the data dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | str | Unique key to attach the value to. | required |
value | Any | Value corresponding to the given key. | required |
Raises:
Type | Description |
---|---|
ValueError | if the key is empty or None. |
get_config(key)
Get the config value corresponding to the given key.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | Union[ConfigOption, str] | Unique key to get the value from. | required |
Raises:
Type | Description |
---|---|
ValueError | if the key does not exist. |
get_input_path(filepath)
Get the constructed input path for the given file path. If the file path is absolute or null, the path is left unchanged, otherwise the path is considered relative to the data root path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filepath | str | Filename or file path to construct the input path from. | required |
Returns:
Type | Description |
---|---|
str | The constructed input path to the file. |
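The resolution rule described for get_input_path can be sketched with the standard library alone. This is a minimal illustration and not the OneCode implementation: `resolve_input_path` and its `data_root` argument are hypothetical names, and "null" is assumed here to mean `None` or an empty string.

```python
import os
from typing import Optional


def resolve_input_path(data_root: str, filepath: Optional[str]) -> Optional[str]:
    """Hypothetical helper mirroring the rule described for get_input_path."""
    # Null (assumed: None or empty) and absolute paths are returned unchanged.
    if not filepath or os.path.isabs(filepath):
        return filepath
    # Anything else is treated as relative to the data root.
    return os.path.join(data_root, filepath)


root = os.path.join(os.sep, "project", "data")
relative = resolve_input_path(root, "wells.csv")
absolute = resolve_input_path(root, os.path.join(os.sep, "tmp", "wells.csv"))
```

Here `relative` ends up under the data root, while `absolute` comes back exactly as it was passed in.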
get_output_manifest()
Get the path to the current flow manifest file, typically <data_root>/outputs/<flow>/MANIFEST.txt. If the path does not exist, it is automatically created.
The manifest file is a collection of output data attributes: there is typically one entry per output file, each entry holding that file's attributes. Each line is a JSON entry, but the file as a whole is not a valid JSON document.
Example
{"key": "x", "value": "file1.csv", "kind": "FileOutput", "tags": ["CSV"], "mimetype": "text/csv" }
{"key": "y", "value": "file2.txt", "kind": "FileOutput", "tags": ["TXT"], "mimetype": "text/plain" }
...
Returns:
Type | Description |
---|---|
str | Path to the output MANIFEST.txt file for the currently running flow. |
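Because each line of the manifest is its own JSON entry, the file has to be parsed line by line rather than with a single `json.load` call. The sketch below shows one way to do that; `read_manifest` is a hypothetical helper, not part of OneCode, and the stand-in manifest reuses the values from the example above.

```python
import json
import os
import tempfile
from typing import Any, Dict, List


def read_manifest(path: str) -> List[Dict[str, Any]]:
    """Hypothetical reader: parse each non-empty line as its own JSON object."""
    entries = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                entries.append(json.loads(line))
    return entries


# Build a small stand-in manifest, one JSON entry per line.
sample = [
    {"key": "x", "value": "file1.csv", "kind": "FileOutput", "tags": ["CSV"], "mimetype": "text/csv"},
    {"key": "y", "value": "file2.txt", "kind": "FileOutput", "tags": ["TXT"], "mimetype": "text/plain"},
]
with tempfile.TemporaryDirectory() as tmp:
    manifest_path = os.path.join(tmp, "MANIFEST.txt")
    with open(manifest_path, "w", encoding="utf-8") as f:
        for entry in sample:
            f.write(json.dumps(entry) + "\n")
    entries = read_manifest(manifest_path)

# Index the output files by their key.
files_by_key = {e["key"]: e["value"] for e in entries}
```

In a real project, the path would presumably come from get_output_manifest() instead of a temporary directory.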
get_output_path(filepath)
Get the constructed output path for the given file path. The path is always considered relative to the data output path (typically <data_root>/outputs/).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filepath | str | Filename or file path to construct the output path from. | required |
Returns:
Type | Description |
---|---|
str | The constructed output path to the file. |
register_element(element_name)
Register the given element as part of the elements to be processed. The element must be of
the form 'onecode_ext.MyInput
Parameters:
Name | Type | Description | Default |
---|---|---|---|
element_name | str | Python name of the element (i.e. class name). | required |
Raises:
Type | Description |
---|---|
ValueError | if element is not of the form ' |
reset(keep_registered_elements=False)
Reset the project to its default values:
- the data path is set, in order of priority, to ONECODE_PROJECT_DATA if provided in the environment variables, otherwise to the data folder located in the directory from which the project is run if it exists (typically the OneCode project data folder), otherwise to the current working directory.
- mode is Mode.CONSOLE.
- currently running flow and data are None.
- registered elements default to the OneCode ones unless keep_registered_elements is True.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
keep_registered_elements | bool | Keep previously registered elements. | False |
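The priority order used for the data path can be sketched as a small function. This is an illustration under stated assumptions, not the library's code: `default_data_root` is a hypothetical name, the environment is passed in as a mapping so the example does not depend on the real process environment, and any validation the library performs on the resulting path is left out.

```python
import os
import tempfile
from typing import Mapping


def default_data_root(env: Mapping[str, str], run_dir: str) -> str:
    """Hypothetical helper following the priority order described for reset()."""
    # 1. Environment variable, when provided.
    from_env = env.get("ONECODE_PROJECT_DATA")
    if from_env:
        return from_env
    # 2. A "data" folder in the directory the project is run from, if it exists.
    candidate = os.path.join(run_dir, "data")
    if os.path.isdir(candidate):
        return candidate
    # 3. Fall back to the current working directory.
    return os.getcwd()


with tempfile.TemporaryDirectory() as run_dir:
    fallback = default_data_root({}, run_dir)  # no env var, no data folder yet
    os.mkdir(os.path.join(run_dir, "data"))
    local_expected = os.path.join(run_dir, "data")
    local = default_data_root({}, run_dir)  # data folder now exists
    forced = default_data_root({"ONECODE_PROJECT_DATA": "/srv/data"}, run_dir)
```

The environment variable wins even when a local data folder exists, which matches the "in priority" wording of the first bullet.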
set_config(key, value)
Add a key-value pair to the config dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | Union[ConfigOption, str] | Unique key to attach the value to. | required |
value | Any | Value corresponding to the given key. | required |
Raises:
Type | Description |
---|---|
ValueError | if the key is empty or None. |
write_output(output)
Write data to the output manifest file corresponding to the currently running flow. This function is thread-safe and process-safe, i.e. if there is concurrent writing to the manifest file (e.g. parallelization through multiprocessing), writes are queued so that there is no overwrite or other side effect. The file therefore remains valid, without data loss.
Although this function is typically called automatically during the OutputElement execution, it can also be called manually to output custom data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output | Dict | Output data to write to the manifest file. | required |
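To illustrate what queued writes mean for a line-oriented manifest, the sketch below serializes concurrent appends with a lock so that each entry lands on its own complete line. How OneCode achieves this internally is not described here; `append_entry` and `_manifest_lock` are hypothetical, and a `threading.Lock` only covers threads within one process (a multi-process setup would need an OS-level file lock instead).

```python
import json
import os
import tempfile
import threading
from typing import Any, Dict

# Hypothetical lock; it guards threads in this process only.
_manifest_lock = threading.Lock()


def append_entry(manifest_path: str, output: Dict[str, Any]) -> None:
    """Hypothetical writer: append one JSON entry per line under a lock."""
    line = json.dumps(output) + "\n"
    with _manifest_lock:
        with open(manifest_path, "a", encoding="utf-8") as f:
            f.write(line)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "MANIFEST.txt")
    threads = [
        threading.Thread(target=append_entry, args=(path, {"key": f"k{i}", "value": i}))
        for i in range(8)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Every line must still parse as a standalone JSON entry.
    with open(path, encoding="utf-8") as f:
        written = [json.loads(line) for line in f]
```

The order of entries in `written` depends on thread scheduling, but no entry is lost or interleaved with another.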