net.sf.jmatchparser.util
Class CachingDownloader

java.lang.Object
  extended by net.sf.jmatchparser.util.CachingDownloader

public class CachingDownloader
extends Object

A utility class for downloading files from the Internet for parsing. It provides a built-in disk cache so that the same URLs are not downloaded again after a crash or after fixing a bug.
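
The caching idea described above can be sketched roughly as follows. This is a minimal illustration and not the source of CachingDownloader: the class DiskCacheSketch, the Fetcher callback, and the temporary-file step are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of a disk cache; not the actual CachingDownloader code.
class DiskCacheSketch {

    /** Hypothetical callback that performs the real network download. */
    interface Fetcher {
        byte[] fetch(String url) throws IOException;
    }

    private final Path cacheDir;
    private final Fetcher fetcher;

    DiskCacheSketch(Path cacheDir, Fetcher fetcher) {
        this.cacheDir = cacheDir;
        this.fetcher = fetcher;
    }

    /** Opens the cached copy, downloading and storing it first if it is missing. */
    InputStream open(String url, String cacheName) throws IOException {
        Path cached = cacheDir.resolve(cacheName);
        if (!Files.exists(cached)) {
            byte[] body = fetcher.fetch(url);
            Files.createDirectories(cacheDir);
            // Write to a temporary file first so that a crash during the write
            // does not leave a truncated entry under the final cache name.
            Path tmp = Files.createTempFile(cacheDir, "download", ".part");
            Files.write(tmp, body);
            Files.move(tmp, cached, StandardCopyOption.REPLACE_EXISTING);
        }
        return Files.newInputStream(cached);
    }
}
```

With a cache of this shape, a parser that crashes halfway through a run can be restarted and will read the already downloaded pages from disk instead of requesting them again.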


Constructor Summary
CachingDownloader(File cachePath)
          Create a new caching downloader that stores its cache in the given directory.
CachingDownloader(File cachePath, Proxy proxy, String cookies, String useragent, String forwardedFor, long delay, int blacklistedSize)
          Create a new caching downloader with all supported options.
CachingDownloader(File cachePath, String cookies, String useragent)
          Create a new caching downloader with support for custom user agents and cookies.
 
Method Summary
 InputStream download(String url, String cacheName)
          Download the given file.
 InputStream download(String url, String postdata, String cacheName)
          Download the given file using user defined POST data.
static String loadStream(InputStream in, String encoding)
          Load an InputStream completely into a String.
 void setDebugStream(PrintStream debugStream)
          Set the debug stream to which status information is written (System.err by default).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CachingDownloader

public CachingDownloader(File cachePath)
Create a new caching downloader that stores its cache in the given directory.

Parameters:
cachePath - Directory in which the cached files are stored

CachingDownloader

public CachingDownloader(File cachePath,
                         String cookies,
                         String useragent)
Create a new caching downloader with support for custom user agents and cookies.

Parameters:
cachePath - Directory in which the cached files are stored
cookies - Value for the Cookie header
useragent - Value for the User-Agent header

CachingDownloader

public CachingDownloader(File cachePath,
                         Proxy proxy,
                         String cookies,
                         String useragent,
                         String forwardedFor,
                         long delay,
                         int blacklistedSize)
Create a new caching downloader with all supported options.

Parameters:
cachePath - Directory in which the cached files are stored
proxy - Proxy to use for downloading
cookies - Value for the Cookie header
useragent - Value for the User-Agent header
forwardedFor - Value for the X-Forwarded-For header. Every * is replaced by a random number between 0 and 255 for each request.
delay - Delay to wait before each download (useful if the target site blocks clients that download too quickly)
blacklistedSize - Size of a proxy error page. If the response has the given size, the download will be repeated. Only needed if the proxy uses status code 200 for its error pages.
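
The wildcard replacement described for forwardedFor might look like the sketch below. It is an illustration only: the class and method names are hypothetical, and treating the range 0 to 255 as inclusive on both ends is an assumption, since the parameter description does not say.

```java
import java.util.Random;

// Hypothetical illustration of the forwardedFor wildcard; not the library's own code.
class ForwardedForSketch {

    /** Replaces every '*' in the template with a random number from 0 to 255. */
    static String expandForwardedFor(String template, Random random) {
        StringBuilder out = new StringBuilder();
        for (char c : template.toCharArray()) {
            if (c == '*') {
                // Assumption: both 0 and 255 are possible values.
                out.append(random.nextInt(256));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

A template such as 10.*.*.1 would then yield a different address for each request while keeping the first and last octets fixed.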

Method Detail

download

public InputStream download(String url,
                            String cacheName)
                     throws IOException
Download the given file.

Parameters:
url - URL of the file
cacheName - Name to use for the file in the cache (must be a valid file name)
Returns:
An input stream to read from the file
Throws:
IOException
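
Because cacheName must be a valid file name, callers usually need to derive it from the URL in some way. One possible approach is sketched below; the helper toCacheName is hypothetical and not part of this class, and the set of characters it keeps is an assumption about what is safe on common file systems.

```java
// Hypothetical helper for deriving a cache name from a URL; not part of CachingDownloader.
class CacheNameSketch {

    /** Replaces every character outside [A-Za-z0-9._-] with an underscore. */
    static String toCacheName(String url) {
        return url.replaceAll("[^A-Za-z0-9._-]", "_");
    }
}
```

For example, toCacheName("http://example.com/a?b=1") gives http___example.com_a_b_1. Distinct URLs can map to the same name under this scheme, and very long URLs may exceed file name length limits, so a real caller might prefer a hash of the URL.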

download

public InputStream download(String url,
                            String postdata,
                            String cacheName)
                     throws IOException
Download the given file using user defined POST data.

Parameters:
url - URL of the file
postdata - POST data to send to the URL, or null to send no POST data
cacheName - Name to use for the file in the cache (must be a valid file name)
Returns:
An input stream to read from the file
Throws:
IOException
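
The expected format of postdata is not specified here. Assuming the target site expects an ordinary URL-encoded form body, the string could be built as in the following sketch. The class PostDataSketch and its formEncode method are hypothetical, and the URLEncoder overload used requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that builds a URL-encoded form body; not part of CachingDownloader.
class PostDataSketch {

    /** Joins the fields as key=value pairs separated by '&', encoding keys and values. */
    static String formEncode(Map<String, String> fields) {
        StringJoiner joined = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            joined.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return joined.toString();
    }
}
```

When the same URL is requested with different POST data, it is probably wise to choose a different cacheName for each request, since the cache entry is identified by that name.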

loadStream

public static String loadStream(InputStream in,
                                String encoding)
                         throws IOException
Load an InputStream completely into a String.

Parameters:
in - Stream to read from
encoding - Encoding to use
Returns:
The complete content of the stream
Throws:
IOException
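
For readers curious what loading a stream completely into a String involves, a standalone equivalent is sketched below. It is not the implementation of loadStream; the class StreamToStringSketch and the method readFully are hypothetical names used for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical standalone equivalent of loadStream; not the library's implementation.
class StreamToStringSketch {

    /** Reads the stream until end of input and decodes all bytes with the given encoding. */
    static String readFully(InputStream in, String encoding) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        // Decode only after all bytes are collected, so that a multi-byte
        // character split across two chunks is still decoded correctly.
        return buffer.toString(encoding);
    }
}
```

An unknown encoding name results in an UnsupportedEncodingException, which is a subclass of IOException.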

setDebugStream

public void setDebugStream(PrintStream debugStream)
Set the debug stream to which status information is written (System.err by default).

Parameters:
debugStream - New debug stream to use


Copyright © 2011. All Rights Reserved.