# gbBotDetectPlugin plugin

## Overview

The `gbBotDetectPlugin` is a symfony plugin that detects bots in web requests.

**WARNING**: Since version 1.0.0, gbBotDetectPlugin **is not backward compatible**. The most important changes are:

  * The facility is provided in actions through the sfRequest object, not the sfUser object.
  * The configuration directives have been updated.

Please read on for more information.

## Installation

1.1 Using the symfony plugin installation task:

    ./symfony plugin:install gbBotDetectPlugin

1.2 Using the svn method:

    cd plugins
    svn co http://svn.symfony-project.com/plugins/gbBotDetectPlugin

2. Enable the plugin in your ProjectConfiguration

Edit your *config/ProjectConfiguration.class.php* to enable the gbBotDetect plugin by adding the line below to the *setup* function (if the install task has not added it automatically):

    [php]
    $this->enablePlugins('gbBotDetectPlugin');

## Configuration

The gbBotDetect plugin can be configured at two levels:

  * The app.yml configuration file (global or application), used for regular configuration
  * The bot_detect_factories.yml global configuration file, used for advanced configuration

## Regular configuration - `app.yml`

The configuration directives in app.yml are specified under the **gbBotDetectPlugin** key. The following configuration directives are supported in `app.yml`:

    [yml]
    all:
      gbBotDetectPlugin:
        listtype: basic

Explanations:

  * **listtype** - The list type to use when searching for bots. See the *Bot list types* section below.

## Bot list types

For performance reasons, two built-in bot list types exist: **basic** (the default), which has a small list of bots, and **extended**, which has a large list of bots.

To switch to the extended list type, write in app.yml:

    [yml]
    all:
      gbBotDetectPlugin:
        listtype: extended

It is also possible to add your own list types by creating a `<fileprefix><custom_list_type_name>.yml` file in `<basedir>` and setting *listtype* to "custom_list_type_name" in app.yml. By default `<fileprefix>` is "bots." and `<basedir>` is "data/" (global), but both can be configured (see the `Advanced configuration` section below).

For example, to define a `mylist` list type, create a file `bots.mylist.yml` under the data/ dir (in your symfony project root) and specify in `app.yml`:

    [yml]
    all:
      gbBotDetectPlugin:
        listtype: mylist

## Usage

There are two basic ways to use the gbBotDetect facility:

  * the `sfRequest` object, from an action
  * the `sfContext` object, from filters or anywhere else

## sfRequest object

The following methods are available as extensions to the sfRequest object:

    [php]
    $request->isBot()

Returns true if the user is a known *bot*, and false otherwise.

    [php]
    $request->whatBot()

Returns the recognized bot id (as specified by the id key in the `bots.*.yml` files), or false when no bot is recognized.

The same methods are available in a template through the symfony `$sf_request` template variable.

## sfContext object

The bot detect facility is available, through the sfContext singleton object, anywhere other than actions (in actions you should always use the sfRequest object). Also note that sfContext::getInstance() should, in general, be avoided; the sfContext object should be obtained through local object getters when appropriate. For example, to get the context in a filter use `$this->getContext()`.

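As a rough sketch of the filter case, the class below obtains the context with `$this->getContext()`, asks it for the bot detect object and looks up the current request. The class name `myBotFilter` and the `bot_id` request attribute are hypothetical and chosen only for illustration; `getBotDetect()` and `whatBot()` are the plugin methods described in this section, and the remaining calls are assumed to be the standard symfony 1.4 `sfFilter` and `sfWebRequest` methods. The filter is also assumed to be registered in the application's filters.yml.

    [php]
    // Hypothetical filter, not part of the plugin
    class myBotFilter extends sfFilter
    {
      public function execute($filterChain)
      {
        // Run the detection only once per request
        if ($this->isFirstCall())
        {
          $context = $this->getContext();
          $request = $context->getRequest();

          // Look up the current user agent and IP in the configured bot list
          $botId = $context->getBotDetect()->whatBot(
            $request->getHttpHeader('User-Agent'),
            $request->getRemoteAddress()
          );

          // whatBot() returns the bot id, or false when no bot is recognized
          if (false !== $botId)
          {
            // 'bot_id' is an illustrative attribute name
            $request->setAttribute('bot_id', $botId);
          }
        }

        // Always pass control to the next filter in the chain
        $filterChain->execute();
      }
    }

Storing the result as a request attribute is only one option; a filter could just as well skip logging or session handling for recognized bots.
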
To get an instance of the gbBotDetect object:

    [php]
    $context->getBotDetect()

The following utility methods are provided by the gbBotDetect class:

    [php]
    $gbBotDetect->whatBot($useragent, $ip, $type = null)

Returns the bot id, or false if not found, for the provided `$useragent` string, `$ip` address and bot list `$type` (when `$type` is not provided, the listtype from the app config is used).

    [php]
    $gbBotDetect->getMeta($botid)

Returns an associative array of bot meta data as specified in the bot list. Note that meta data is not mandatory for every bot definition. See the *Bot list definition* section below.

## Advanced configuration - bot_detect_factories.yml

The gbBotDetect plugin manages its advanced configuration through the symfony factories mechanism. This provides more flexibility than the app.yml method, such as using a configurable class for the gbBotDetect object. Moreover, it relies on the standard config caching, which is both efficient and fast.

To configure the bot_detect factory, copy plugin_dir/config/bot_detect_factories.yml to the global or application config/ directory and specify the desired configuration entries.

The configuration entries and their default values are:

    [yml]
    all:
      bot_detect:
        class: gbBotDetectFile
        param:
          basedir:      data
          fileprefix:   'bots.'
          defaultmatch: patterni

Explanations:

  * **class** - The gbBotDetect backend class. For the moment only the file backend is supported.
  * **basedir** - The base dir, relative to sf_root_dir, where the bot definition files live. This can also be a full path. Configuration variable substitution is also performed (e.g. %%SF_ROOT_DIR%%).
  * **fileprefix** - The prefix for the bot definition filenames. The final name will be in the format `<PREFIX><TYPE>.yml`.
  * **defaultmatch** - The default match operator for the bot list entries, used when an entry does not specify one. See *Bot list definition* for the supported operators.

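As an illustration of overriding these entries, the fragment below keeps the file backend but moves the definition files to a subdirectory and changes the prefix and the default operator. The values `data/bots`, `'list.'` and `regexpi` are hypothetical choices for this sketch, not plugin defaults.

    [yml]
    all:
      bot_detect:
        class: gbBotDetectFile
        param:
          basedir:      data/bots   # hypothetical directory, relative to sf_root_dir
          fileprefix:   'list.'     # hypothetical prefix
          defaultmatch: regexpi     # used by entries that set no match operator

Under these assumptions, a `listtype` of `mylist` in app.yml would be expected to resolve to `data/bots/list.mylist.yml`, following the prefix and type filename format described above.
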
## Bot list definition

The bot definitions list is a YAML file with the following entries:

    [yml]
    Bots:
      %botId%:
        agent: %matchstring%
        ip:    [can be null]
        match: [optional] regexp (default), regexpi (case insensitive), exact, pattern (*,? meta chars), patterni (case insensitive)
        meta:  [optional, also all subfields optional and arbitrary]
          url:    %boturl%
          co:     %botcompany%
          co_url: %company url%
          type:   Bot|Crawler|....

## TODO (28/02/2012)

  * Add an update bots list task, possibly reading from the http://user-agent-string.info/ file format, and/or add a UASParser conversion script
  * Add a database version with an administration module
  * Add a database backend with import and export tasks