Introduction
Pescador is a library for streaming (numerical) data for use in iterative machine learning applications.
The core concept is the Streamer object, which encapsulates a Python generator to allow for re-use and inter-process communication.
The basic use case is as follows:
- Define a generator function g which yields a dictionary of numpy arrays at each step
- Construct a Streamer object stream = Streamer(g, args...)
- Iterate over examples generated by stream().
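The three steps above can be sketched without installing anything. The `ReusableStream` class below is a hypothetical, heavily simplified stand-in for `Streamer`, not pescador's implementation. It only illustrates why wrapping the generator function together with its arguments allows re-use. The generator `noisy_samples`, its parameters, and the `max_iter` argument are also assumptions made for this illustration, and plain Python lists stand in for numpy arrays so that the sketch needs only the standard library.

```python
import random


class ReusableStream:
    """Hypothetical, simplified stand-in for a Streamer-like wrapper."""

    def __init__(self, generator_fn, *args, **kwargs):
        # Store the function and its arguments rather than a live generator,
        # so that a fresh generator can be created on every call.
        self.generator_fn = generator_fn
        self.args = args
        self.kwargs = kwargs

    def __call__(self, max_iter=None):
        # Instantiate a new generator and optionally stop after max_iter items.
        for n, item in enumerate(self.generator_fn(*self.args, **self.kwargs)):
            if max_iter is not None and n >= max_iter:
                break
            yield item


def noisy_samples(n_samples, dimension, seed=0):
    """Yield one dictionary per step (lists stand in for numpy arrays)."""
    rng = random.Random(seed)
    for i in range(n_samples):
        yield {
            "X": [rng.gauss(0.0, 1.0) for _ in range(dimension)],
            "Y": [i % 2],
        }


# Step 2: wrap the generator function and its arguments.
stream = ReusableStream(noisy_samples, 5, 3)

# Step 3: iterate. Each call starts from a fresh generator.
first_pass = list(stream())
second_pass = list(stream())  # a bare generator would already be exhausted
truncated = list(stream(max_iter=2))
```

With pescador installed, the stand-in class would presumably be replaced by the library's own `Streamer`, constructed as in the second step above.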
On top of this basic functionality, pescador provides the following tools:
- Buffering of sampled data into fixed-size batches (see Buffered Streamers)
- Dynamic multiplexing of multiple data streams (see Multiplexing)
- Parallel processing (see Parallel streaming)
For examples of each of these use cases, refer to the Examples section.