======================
The `datastore` module
======================

The ``datastore`` module aims to present a consistent interface for persistent
data storage, irrespective of the storage backend. Its main intended use is
caching intermediate results. If an object's data takes a long time to
compute, the object can save the data to a datastore the first time it is run.
If an identical object is needed later, it can retrieve its data from the
store instead of computing it again.

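As a minimal sketch of this compute-once-then-retrieve pattern, the following
uses the standard-library ``shelve`` module directly rather than the
``datastore`` module. The names ``expensive_square`` and ``cached_square`` are
hypothetical, and the key is built by hand, so the caller has to keep track of
it.

```python
import os
import shelve
import tempfile


def expensive_square(x, calls):
    """Stand-in for a slow computation; records each real evaluation."""
    calls.append(x)
    return x * x


def cached_square(db_path, x, calls):
    """Return x * x, computing it only if it is not already in the shelf."""
    key = "square_%r" % (x,)  # hand-built key (hypothetical naming scheme)
    with shelve.open(db_path) as db:
        if key not in db:
            db[key] = expensive_square(x, calls)
        return db[key]


calls = []
db_path = os.path.join(tempfile.mkdtemp(), "cache")
first = cached_square(db_path, 12, calls)   # computed, then saved
second = cached_square(db_path, 12, calls)  # retrieved from the shelf
```

After both calls, ``calls`` holds a single entry, because the second call found
the stored value and skipped the computation.
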
Because objects are intended to store part or all of their internal data, the
storage and retrieval keys are based on the object's identity and state.

We assume that an object's identity is uniquely defined by its type (which may
also depend on the source-code revision number) and its parameters, while its
state is defined by its identity and by its inputs (we should possibly add some
concept of time to this).

Hence, any object (which we call a 'component' in this context) must have
the following attributes:

* ``parameters``: a NeuroTools ``ParameterSet`` object.
* ``input``: another component or ``None``. We assume a single input for
  now, although a list of inputs should also be possible. We need to be
  wary of recurrent loops, in which two components have each other as
  direct or indirect inputs.
* ``full_type``: the object class and module.
* ``version``: the source-code version.

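To make the distinction between identity and state concrete, here is one way
such keys could be derived from the four attributes. This is a sketch under
stated assumptions, not the scheme the ``datastore`` module actually uses:
``ToyComponent``, ``identity_key`` and ``state_key`` are hypothetical names, a
plain ``dict`` stands in for ``ParameterSet``, and SHA-1 is an arbitrary choice
of hash.

```python
import hashlib


class ToyComponent(object):
    """Hypothetical component carrying the four required attributes."""

    version = "0.1"  # placeholder for the source-code version

    def __init__(self, parameters, input=None):
        self.parameters = parameters  # plain dict standing in for ParameterSet
        self.input = input            # another component, or None
        self.full_type = "%s.%s" % (type(self).__module__,
                                    type(self).__name__)


def identity_key(component):
    """Hash of type, version and parameters: the component's identity."""
    params = repr(sorted(component.parameters.items()))
    text = "|".join([component.full_type, component.version, params])
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def state_key(component):
    """Identity combined with the state of the input chain.

    No guard against recurrent loops: a cycle of inputs would recurse forever.
    """
    if component.input is None:
        upstream = "None"
    else:
        upstream = state_key(component.input)
    text = identity_key(component) + "|" + upstream
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


source = ToyComponent({"rate": 10.0})
a = ToyComponent({"gain": 2}, input=source)
b = ToyComponent({"gain": 2}, input=source)
c = ToyComponent({"gain": 3}, input=source)
```

Here ``a`` and ``b`` are distinct Python objects but have the same type,
version, parameters and input, so they get the same state key and could share
stored data. ``c`` differs in one parameter and gets a different key.
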
There are two advantages to using the ``datastore`` module rather than just
using, say, ``shelve`` directly:

1. You don't have to worry about keeping track of the key used to identify
   your data in the store: the ``DataStore`` object takes care of this for
   you.
2. You can use different backends to store your data (local filesystem,
   remote filesystem, database) and to manage the keys (``shelve``, a
   database, the filesystem), and the interface remains the same.


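The second advantage can be illustrated with a toy example in which two
backends expose the same ``store`` and ``retrieve`` methods, so calling code
does not change when the backend does. ``ToyMemoryStore``, ``ToyShelveStore``
and ``round_trip`` are hypothetical names, not part of NeuroTools, and keys are
passed explicitly here only to keep the sketch short.

```python
import os
import shelve
import tempfile


class ToyMemoryStore(object):
    """Hypothetical backend that keeps data in an in-memory dict."""

    def __init__(self):
        self._data = {}

    def store(self, key, value):
        self._data[key] = value

    def retrieve(self, key):
        return self._data.get(key)  # None if the key is absent


class ToyShelveStore(object):
    """Hypothetical backend that keeps data in a shelve file in root_dir."""

    def __init__(self, root_dir):
        self._path = os.path.join(root_dir, "toy_store")

    def store(self, key, value):
        with shelve.open(self._path) as db:
            db[key] = value

    def retrieve(self, key):
        with shelve.open(self._path) as db:
            return db.get(key)  # None if the key is absent


def round_trip(store):
    """Backend-agnostic caller: works with any object offering both methods."""
    store.store("spike_counts", [3, 1, 4])
    return store.retrieve("spike_counts")


results = [round_trip(ToyMemoryStore()),
           round_trip(ToyShelveStore(tempfile.mkdtemp()))]
```

Both entries of ``results`` hold the same list, although one was kept in memory
and the other written to disk.
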
Creating a datastore
~~~~~~~~~~~~~~~~~~~~

Two storage backends are currently available, ``ShelveDataStore`` and
``DjangoORMDataStore``, and more will be added in future. It is also intended
to be easy to write your own custom storage backend. Whichever backend you
use, the interface is the same once the datastore has been created. For this
example we will use the ``ShelveDataStore``::

    >>> from NeuroTools.datastore import ShelveDataStore
    >>> datastore = ShelveDataStore(root_dir="/tmp")

Here we specify that the ``shelve`` files will be created in ``/tmp``. Now let
us create a simple component whose data we wish to store::

    >>> class SimpleComponent(object):
    ...     def __init__(self, parameters,