Harvest Server Installation and Configuration
This document describes the installation and configuration of Harvest Server.
Installation
Download the latest binary release (ZIP file) and extract it to a directory of your choice, such as /opt/big-data-harvest-1.0.0.
Running
Run harvest-server from the bin directory without any parameters to print usage information:
/opt/big-data-harvest-1.0.0/bin/harvest-server
Usage: harvest-server <options>

Commands:
  -c <config file>   Start Harvest server
  -V, --version      Print Harvest version

Optional parameters:
  -l <file>    Log file. Default is /tmp/harvest/harvest.log
  -v <level>   Logger verbosity: DEBUG / ALL, INFO (default), WARN, ERROR
To start the server, run:
/opt/big-data-harvest-1.0.0/bin/harvest-server -c /opt/big-data-harvest-1.0.0/conf/harvest-server.cfg
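The optional parameters from the usage output can be combined with -c. The following is a minimal sketch, assuming the install path used above and a hypothetical log directory ($HOME/harvest-logs) chosen only for illustration. It uses only the documented options (-c, -l, -v).

```shell
#!/bin/sh
# Install location used in the examples above; adjust to your system.
HARVEST_HOME="${HARVEST_HOME:-/opt/big-data-harvest-1.0.0}"
# Hypothetical log location, chosen for illustration only.
LOG_DIR="${LOG_DIR:-$HOME/harvest-logs}"

# Make sure the log directory exists before pointing -l at it.
mkdir -p "$LOG_DIR"

# Only documented options are used: -c (config), -l (log file), -v (verbosity).
if [ -x "$HARVEST_HOME/bin/harvest-server" ]; then
    "$HARVEST_HOME/bin/harvest-server" \
        -c "$HARVEST_HOME/conf/harvest-server.cfg" \
        -l "$LOG_DIR/harvest.log" \
        -v DEBUG
else
    echo "harvest-server not found under $HARVEST_HOME/bin" >&2
fi
```

DEBUG verbosity is useful while verifying a new installation; omitting -v falls back to the default INFO level.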
Configuration
An example configuration file is located in the installation directory (/conf/harvest-server.cfg). The following parameters are available.
Message Broker Parameters
Parameter | Description |
---|---|
mq.type | Message broker type. Currently only "RabbitMQ" is supported. More types may be added in future releases. |
rmq.host | RabbitMQ "host:port" tuples (one tuple per line). For example, "localhost:5672". |
rmq.user | RabbitMQ user. For example, "harvest". |
rmq.password | RabbitMQ password. For example, "harvest1234". |
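As an illustration of the broker parameters, the sketch below writes a broker section to a scratch file. It assumes a properties-style "key = value" syntax with "#" comments, and it interprets "one tuple per line" as repeating the rmq.host key once per broker node. Both are assumptions, so check them against the bundled example configuration file. The second host name and the scratch file path are hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch location; merge the lines into your real config by hand.
CFG="${CFG:-${TMPDIR:-/tmp}/harvest-broker-example.cfg}"

cat > "$CFG" <<'EOF'
# Message broker section (values are the examples from the table above).
# ASSUMPTION: "key = value" syntax; verify against conf/harvest-server.cfg.
mq.type = RabbitMQ

# One "host:port" tuple per line. The second host is a hypothetical cluster node.
rmq.host = localhost:5672
rmq.host = rabbit2.example.org:5672

rmq.user = harvest
rmq.password = harvest1234
EOF

# Count the configured broker endpoints.
grep -c '^rmq\.host' "$CFG"
```

Because the file holds a password in plain text, restricting read access to the account that runs the server is a reasonable precaution.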
Registry (Elasticsearch) Parameters
Parameter | Description |
---|---|
es.url | Elasticsearch (Registry) URL. For example, "http://localhost:9200". |
es.index | Elasticsearch (Registry) index name. For example, "registry". |
es.authFile | Optional parameter. Elasticsearch authentication file. For example, "/etc/pds-registry/auth.cfg". |
Other Parameters
Parameter | Description |
---|---|
web.port | Embedded web server port. Default value is 8005. |
harvest.storeLabels | Optional parameter. Store original PDS labels (XML) as BLOBs. Default value is "true". |
harvest.storeJsonLabels | Optional parameter. Store PDS labels in JSON format as BLOBs. Default value is "true". |
harvest.processDataFiles | Optional parameter. Extract basic file information and calculate MD5 hashes of all data files referenced in a PDS label. |
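Before starting the server, it can help to confirm that a configuration file sets every parameter the tables above do not mark as optional or defaulted. The function below is a rough sketch of such a check. check_harvest_cfg is a hypothetical helper, not part of the Harvest distribution, and it assumes "key = value" lines, which should be verified against the bundled example file.

```shell
#!/bin/sh
# Sketch: report required keys missing from a Harvest Server config file.
# ASSUMPTION: the file uses "key = value" lines; this syntax is not confirmed
# by this document, so compare with conf/harvest-server.cfg first.

# check_harvest_cfg is a hypothetical helper, not shipped with Harvest.
# Returns 0 if all keys are present, 1 if any are missing, 2 if unreadable.
check_harvest_cfg() {
    cfg="$1"
    missing=0
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg" >&2
        return 2
    fi
    # Keys the tables above do not mark as optional or as having a default.
    for key in mq.type rmq.host rmq.user rmq.password es.url es.index; do
        # Escape dots so "." matches literally in the regular expression.
        pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
        if ! grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$cfg"; then
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}
```

A typical call would be `check_harvest_cfg /opt/big-data-harvest-1.0.0/conf/harvest-server.cfg`, which prints one line per missing key and returns a non-zero status if any are absent.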