com.norconex.collector.http.filter
Interface IHttpHeadersFilter

All Superinterfaces:
Serializable
All Known Implementing Classes:
ExtensionURLFilter, RegexHeaderFilter, RegexURLFilter

public interface IHttpHeadersFilter
extends Serializable

Filters a document based on its HTTP headers, before the document content is downloaded.

It is highly recommended to override the toString() method so that it represents this filter properly in human-readable form (e.g. for logging). It is a good idea to include the specifics of this filter so that crawler users can tell exactly why documents were accepted or rejected, if need be.
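
As a minimal sketch of the toString() recommendation, the hypothetical filter below includes its header name and regular expression in its string form, so a log line shows which rule accepted or rejected a document. HeaderPatternFilter is an illustrative name that is not part of the library. The HttpMetadata and IHttpHeadersFilter declarations are simplified stand-ins so the snippet compiles on its own, and the getString accessor is an assumption, not an API documented on this page.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Simplified stand-in for the collector's HttpMetadata (assumption, not the real class).
class HttpMetadata implements Serializable {
    private final Map<String, String> values = new HashMap<>();
    void put(String name, String value) { values.put(name, value); }
    String getString(String name) { return values.get(name); }
}

// Stand-in mirroring the interface documented on this page.
interface IHttpHeadersFilter extends Serializable {
    boolean acceptDocument(String url, HttpMetadata headers);
}

// Hypothetical filter: accepts a document when the given header fully matches a regex.
class HeaderPatternFilter implements IHttpHeadersFilter {
    private final String header;
    private final Pattern pattern;

    HeaderPatternFilter(String header, String regex) {
        this.header = header;
        this.pattern = Pattern.compile(regex);
    }

    @Override
    public boolean acceptDocument(String url, HttpMetadata headers) {
        String value = headers.getString(header);
        // A missing header never matches, so the document is rejected.
        return value != null && pattern.matcher(value).matches();
    }

    // Expose the filter's specifics so log output explains accept/reject decisions.
    @Override
    public String toString() {
        return "HeaderPatternFilter[header=" + header
                + ", regex=" + pattern.pattern() + "]";
    }
}
```

Logging such a filter alongside the URL makes it possible to trace a rejection back to a specific header and pattern.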

Author:
Pascal Essiembre

Method Summary
 boolean acceptDocument(String url, HttpMetadata headers)
          Whether to accept a URL based on its HTTP headers.
 

Method Detail

acceptDocument

boolean acceptDocument(String url,
                       HttpMetadata headers)
Whether to accept a URL based on its HTTP headers.

Parameters:
url - the URL whose headers are being accepted or rejected
headers - HTTP headers associated with the URL
Returns:
true if accepted, false otherwise
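
The contract above can be illustrated with a hedged sketch of a filter that decides from the Content-Length header alone, before any content is fetched. MaxContentLengthFilter is a hypothetical name. The HttpMetadata and IHttpHeadersFilter declarations are simplified stand-ins so the snippet compiles on its own, and the getString accessor is an assumption rather than the library's documented API. Accepting documents with a missing or unparseable length is a design choice of this sketch, not behaviour prescribed by the interface.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the collector's HttpMetadata (assumption, not the real class).
class HttpMetadata implements Serializable {
    private final Map<String, String> values = new HashMap<>();
    void put(String name, String value) { values.put(name, value); }
    String getString(String name) { return values.get(name); }
}

// Stand-in mirroring the interface documented on this page.
interface IHttpHeadersFilter extends Serializable {
    boolean acceptDocument(String url, HttpMetadata headers);
}

// Hypothetical filter: rejects documents whose declared size exceeds a limit.
class MaxContentLengthFilter implements IHttpHeadersFilter {
    private final long maxBytes;

    MaxContentLengthFilter(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    @Override
    public boolean acceptDocument(String url, HttpMetadata headers) {
        String length = headers.getString("Content-Length");
        if (length == null) {
            return true; // no declared size: accept (choice made by this sketch)
        }
        try {
            return Long.parseLong(length.trim()) <= maxBytes;
        } catch (NumberFormatException e) {
            return true; // unparseable value: accept rather than guess
        }
    }

    @Override
    public String toString() {
        return "MaxContentLengthFilter[maxBytes=" + maxBytes + "]";
    }
}
```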


Copyright © 2009-2013 Norconex Inc. All Rights Reserved.