Pipeline
Process Pipeline
-
class forte.pipeline.Pipeline(resource=None)
This controls the main inference flow of the system. A pipeline consists of a set of components (readers and processors). Data flows through the pipeline as data packs, and each component uses or adds information to the data packs.
-
init_from_config_path(config_path)
Read the configuration from the given path config_path and build the pipeline from it.
- Parameters
config_path – A string giving the path to the configuration, which is a YAML file that specifies the structure and parameters of the pipeline.
-
init_from_config(configs)
Initialize the pipeline (ontology and processors) from the given configurations.
- Parameters
configs – The configs used to initialize the pipeline.
-
add_gold_packs(pack)
Add gold packs to the dictionary. This dictionary is used by the evaluator when calling consume_next(…).
- Parameters
pack (Dict) – A dictionary mapping job.id to gold_pack.
-
process(*args, **kwargs)
Alias for process_one().
- Parameters
args – The positional arguments used to get the initial data.
kwargs – The keyword arguments used to get the initial data.
-
run(*args, **kwargs)
Run the whole pipeline and ignore all returned DataPacks. This is mostly used when you do not need the output but rely on the pipeline's side effects, for example when the pipeline writes data to disk.
Calling this function automatically calls initialize() at the beginning and finish() at the end.
- Parameters
args – The positional arguments used to get the initial data.
kwargs – The keyword arguments used to get the initial data.
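The call order described for run() can be sketched with plain Python. This is a minimal illustration, not Forte's implementation: ToyPipeline, ToyComponent, and _iter_packs are hypothetical names invented for this example, and packs are modeled as dictionaries.

```python
from typing import Iterable, Iterator, List


class ToyComponent:
    """Hypothetical stand-in for a pipeline component (not a Forte class)."""

    def __init__(self, name: str, log: List[str]):
        self.name = name
        self.log = log

    def process(self, pack: dict) -> dict:
        # Side effect: record that this component saw the pack.
        self.log.append(f"{self.name} processed {pack['text']}")
        return pack


class ToyPipeline:
    """Hypothetical pipeline that only demonstrates the run() call order."""

    def __init__(self, components: List[ToyComponent], log: List[str]):
        self.components = components
        self.log = log

    def initialize(self) -> None:
        self.log.append("initialize")

    def finish(self) -> None:
        self.log.append("finish")

    def _iter_packs(self, texts: Iterable[str]) -> Iterator[dict]:
        # Build one pack per input and pass it through every component.
        for text in texts:
            pack = {"text": text}
            for component in self.components:
                pack = component.process(pack)
            yield pack

    def run(self, texts: Iterable[str]) -> None:
        self.initialize()
        for _ in self._iter_packs(texts):
            pass  # Packs are discarded; only the side effects matter.
        self.finish()


log: List[str] = []
pipeline = ToyPipeline([ToyComponent("tokenizer", log)], log)
result = pipeline.run(["a", "b"])
```

After the call, result is None and log holds the initialize entry, one entry per processed pack, and the finish entry, in that order.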
-
process_one(*args, **kwargs)
Process a single data pack. This is done by reading and processing only the first pack in the reader.
- Parameters
kwargs – The information needed to load the data. For example, if _reader is StringReader, this should contain a single piece of text as a string. If _reader is a file reader, this can point to the file path.
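The "first pack only" behavior can be illustrated with a generator-based reader. This is a sketch under stated assumptions: toy_string_reader and toy_process_one are hypothetical helpers, the reader here yields one pack per non-empty line purely to make the effect visible, and the whitespace tokenizer stands in for real processors.

```python
from typing import Iterator


def toy_string_reader(text: str) -> Iterator[dict]:
    """Hypothetical reader: lazily yields one pack per non-empty line."""
    for line in text.splitlines():
        if line.strip():
            yield {"text": line}


def toy_process_one(text: str) -> dict:
    """Read and process only the first pack produced by the reader."""
    # next() pulls a single pack; the rest of the input is never consumed.
    first_pack = next(toy_string_reader(text))
    # Stand-in processor: whitespace tokenization.
    first_pack["tokens"] = first_pack["text"].split()
    return first_pack


pack = toy_process_one("Forte builds pipelines.\nThis line is never read.")
```

Because the reader is lazy, the second line is never turned into a pack. An empty input would raise StopIteration in this sketch; a real implementation would need to decide how to handle that case.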
-
Train Pipeline
Pipeline Component
-
class forte.pipeline_component.PipelineComponent
-
initialize(resources, configs)
The pipeline calls the initialize method at the start of processing. The processor and reader are initialized with configs, and can register global resources into resource. The implementation should set up the state of the component.
- Parameters
resources (Resources) – A global resource registry. Users can register shareable resources here, for example, the vocabulary.
configs (Config) – The configuration passed in to set up this component.
-
add_entry(pack, entry)
The component can call this function manually to add the entry to the data pack immediately. Otherwise, the system adds the entries automatically when this component finishes.
Returns:
-
flush()
Indicate that no more packs will be passed in, and handle whatever remains in the buffer.
-
finish(resource)
The pipeline calls this function at the end of the pipeline to notify all the components. Users can implement this function to release resources used by this component. The component can also add objects to the resources.
- Parameters
resource (Resources) – A global resource registry.
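The initialize, flush, and finish hooks fit together roughly as in the following sketch. It assumes a buffering component and models the resource registry and configs as plain dictionaries; ToyBatchingComponent, its process method, and the batch_size key are hypothetical and are not part of Forte.

```python
from typing import List


class ToyBatchingComponent:
    """Hypothetical component; method names follow the hooks above, bodies are invented."""

    def initialize(self, resources: dict, configs: dict) -> None:
        # Set up component state from the configs.
        self.batch_size = configs["batch_size"]
        self.buffer: List[str] = []
        self.emitted: List[List[str]] = []
        # Register a shareable resource, here a vocabulary.
        resources["vocabulary"] = set()
        self.vocabulary = resources["vocabulary"]

    def process(self, text: str) -> None:
        self.buffer.append(text)
        self.vocabulary.update(text.split())
        if len(self.buffer) >= self.batch_size:
            self._emit()

    def _emit(self) -> None:
        # Move the buffered items out as one batch, if there are any.
        if self.buffer:
            self.emitted.append(self.buffer)
            self.buffer = []

    def flush(self) -> None:
        # No more input is coming: handle whatever is left in the buffer.
        self._emit()

    def finish(self, resource: dict) -> None:
        # Add an object to the shared resources and release local state.
        resource["num_batches"] = len(self.emitted)
        self.buffer = []


resources: dict = {}
component = ToyBatchingComponent()
component.initialize(resources, {"batch_size": 2})
for text in ["a b", "b c", "c d"]:
    component.process(text)
component.flush()
component.finish(resources)
```

Without the flush() call, the third input would stay in the buffer and never be emitted, which is the situation the hook exists to handle.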
-
classmethod make_configs(configs)
Create the component configuration for this class by merging the provided config with the default_config.
The following config conventions are expected:
- The top-level key can be a special config_path.
- config_path should point to a file system path, which will be a YAML file containing configurations.
- Other key values in the configs will be considered as parameters.
- Parameters
configs – The input config to be merged with the default config.
- Returns
The merged configuration.
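The merge step can be sketched with plain dictionaries. This is an illustration only: ToyConfigurable and its default values are hypothetical, the merge is shallow, and the config_path branch is deliberately left unimplemented because loading YAML needs a third-party parser.

```python
from typing import Any, Dict, Optional


class ToyConfigurable:
    """Hypothetical class showing a shallow merge of user configs over defaults."""

    @classmethod
    def default_config(cls) -> Dict[str, Any]:
        # Hypothetical defaults, invented for this example.
        return {"batch_size": 8, "lowercase": False}

    @classmethod
    def make_configs(cls, configs: Optional[Dict[str, Any]]) -> Dict[str, Any]:
        configs = configs or {}
        if "config_path" in configs:
            # Loading the YAML file is out of scope for this sketch.
            raise NotImplementedError("config_path is not handled here")
        merged = dict(cls.default_config())  # Copy so defaults stay untouched.
        merged.update(configs)  # Provided values override the defaults.
        return merged


merged = ToyConfigurable.make_configs({"batch_size": 2})
```

Keys missing from the input keep their default values, and the defaults themselves are not modified by the merge.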
-