Juicer is a web API for extracting text, metadata and named entities from HTML "article"-type pages.
For more info visit: http://juicer.herokuapp.com/
Install:
- Maven
- ImageMagick (Windows x64 installer: http://www.imagemagick.org/download/binaries/ImageMagick-6.9.1-7-Q16-x64-dll.exe)
- Build
mvn install
- Run
java -Xmx1g -jar target/juicer-2.0.jar
- Now open http://localhost:8081 in a browser
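
If you would rather confirm from a script that the server has come up, rather than refreshing a browser, something like the following may help. It is only a sketch: it assumes nothing about Juicer's own endpoints and simply polls the root URL from the step above until any HTTP response arrives. The function name `wait_for_server` and its `timeout` and `interval` parameters are made up for this example and are not part of Juicer.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://localhost:8081", timeout=30.0, interval=0.5):
    """Poll `url` until it answers an HTTP request or `timeout` seconds pass.

    Hypothetical helper, not part of Juicer. Returns True as soon as any
    HTTP response arrives (an error status still proves the server is
    listening) and False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval + 1.0):
                return True
        except urllib.error.HTTPError:
            # The server replied, just not with a 2xx status.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: the server is not up yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_server()` in a second terminal after starting the jar should return True once the port is accepting requests. The JVM can take a few seconds to start, so a generous timeout is probably sensible.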