Crawler Server Installation and Configuration
This document describes how to install and configure Crawler Server.
Installation
Download the latest binary release (ZIP file) and extract it to a directory of your choice, such as /opt/big-data-crawler-1.0.0.
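The extraction step can be sketched as a small shell helper. This is only an illustration: the function name install_crawler and the archive file name in the usage line are hypothetical, and the internal layout of the archive is not stated here, so adjust the destination if the ZIP does not contain a top-level big-data-crawler-1.0.0 directory.

```shell
# Sketch: unpack a release archive into a target directory.
# install_crawler is a hypothetical helper, not part of the release.
install_crawler() {
    archive="$1"
    dest="${2:-/opt}"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # -q keeps unzip quiet, -d selects the extraction directory
    mkdir -p "$dest" && unzip -q "$archive" -d "$dest"
}

# Usage (hypothetical archive file name):
#   install_crawler big-data-crawler-1.0.0.zip /opt
```

The helper refuses to run when the archive is missing, so a mistyped path does not leave an empty installation directory behind.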
Running
Run crawler-server from the bin directory without any parameters to print usage information:
/opt/big-data-crawler-1.0.0/bin/crawler-server
Usage: crawler-server <options>
Commands:
  -c <config file>   Start Crawler server
  -V, --version      Print Crawler version
Optional parameters:
  -l <file>          Log file. Default is /tmp/crawler/crawler.log
  -v <level>         Logger verbosity: DEBUG / ALL, INFO (default), WARN, ERROR
To start the server, pass the configuration file with -c:
/opt/big-data-crawler-1.0.0/bin/crawler-server -c /opt/big-data-crawler-1.0.0/conf/crawler-server.cfg
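The optional -l and -v parameters from the usage text can be combined with -c. The wrapper below is a minimal sketch: the function name start_crawler and the CRAWLER_HOME environment variable are assumptions made for this example, while the flags and default values are the ones listed in the usage output.

```shell
# Sketch: start the server with an explicit log file and verbosity.
# start_crawler and CRAWLER_HOME are hypothetical names for this example.
start_crawler() {
    home="${CRAWLER_HOME:-/opt/big-data-crawler-1.0.0}"
    log_file="${1:-/tmp/crawler/crawler.log}"   # default from the usage text
    level="${2:-INFO}"                          # DEBUG / ALL, INFO, WARN, ERROR
    "$home/bin/crawler-server" \
        -c "$home/conf/crawler-server.cfg" \
        -l "$log_file" \
        -v "$level"
}

# Usage (the log path is only an example):
#   start_crawler /var/log/crawler/crawler.log DEBUG
```

Called without arguments, the wrapper passes the documented defaults explicitly, which makes the effective log file and verbosity visible in the process list.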
Configuration
An example configuration file is located in the conf subdirectory of the installation directory (conf/crawler-server.cfg). The following parameters are available.
Message Broker Parameters
Parameter | Description |
---|---|
mq.type | Message broker type. Currently only "RabbitMQ" is supported; more types may be added in future releases. |
rmq.host | RabbitMQ broker addresses as "host:port" pairs, one pair per line. For example, "localhost:5672". |
rmq.user | RabbitMQ user. For example, "harvest". |
rmq.password | RabbitMQ password. For example, "harvest1234". |
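Before starting the server it can help to confirm that each broker address given in rmq.host is reachable. The following sketch is not part of Crawler Server: split_hostport and check_brokers are hypothetical helpers, and the check assumes an nc (netcat) variant that supports the -z and -w options.

```shell
# Split one "host:port" pair into "host port".
split_hostport() {
    pair="$1"
    printf '%s %s\n' "${pair%:*}" "${pair##*:}"
}

# Read "host:port" pairs from stdin (one per line) and probe each with nc.
# Assumes nc is installed and supports -z (scan only) and -w (timeout).
check_brokers() {
    while IFS= read -r pair; do
        [ -n "$pair" ] || continue          # skip empty lines
        set -- $(split_hostport "$pair")    # $1 = host, $2 = port
        if nc -z -w 2 "$1" "$2" 2>/dev/null; then
            echo "reachable: $pair"
        else
            echo "not reachable: $pair"
        fi
    done
}

# Usage:
#   printf 'localhost:5672\n' | check_brokers
```

A successful TCP connection only shows that something is listening on the port; it does not verify the rmq.user and rmq.password credentials.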
Other Parameters
Parameter | Description |
---|---|
web.port | Embedded web server port. Default value is 8001. |