Last weekend I went to my first hackathon, OxHACK. I can honestly say that I have not been surrounded by so many computer scientists in a long time. I spent some 30ish pleasant hacking hours in the company of three kind students.
In the 24 hours of coding time we were given, we decided to create a tool that searches through academic papers in *.pdf format and extracts their introduction and conclusion sections. Once this data has been extracted as ASCII, natural language processing algorithms are applied to compress the sections into just a few sentences, giving a smart summary of the work. We used an existing OCR reader written in Python, "texttopdf", some shell/sed processing (regexp deletion stuff) to clean up the converted ASCII text, and the Python Alchemy library for the natural language processing that compresses the sections. The engine was nicely wrapped into a webpage using Flask, and we also managed to use Mendeley's API to fetch PDF papers for processing from a Mendeley account.
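To give a feel for the cleanup and section extraction steps, here is a minimal sketch in Python. It is not the code we wrote at the hackathon: the real cleanup was done with shell/sed, and the names `clean_text`, `extract_sections` and `HEADING`, as well as the heading pattern itself, are assumptions made up for illustration. It assumes the PDF has already been converted to plain text and that section headings sit on their own line, optionally numbered. The summarisation step is left out.

```python
import re

# Hypothetical heading pattern (an assumption, not the original sed rules):
# an optional section number followed by a short title on its own line,
# e.g. "1. Introduction" or "Conclusions".
HEADING = re.compile(
    r"^[ \t]*(?:\d+\.?[ \t]+)?([A-Z][A-Za-z ]{2,40})[ \t]*$",
    re.MULTILINE,
)


def clean_text(raw):
    """Regexp cleanup in the spirit of the shell/sed step."""
    text = raw.replace("\f", "\n")                # page breaks become newlines
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # rejoin words hyphenated across lines
    text = re.sub(r"^[ \t]*\d+[ \t]*$", "", text, flags=re.MULTILINE)  # drop bare page numbers
    text = re.sub(r"[^\x00-\x7F]+", " ", text)    # keep ASCII only
    return text


def extract_sections(text, wanted=("introduction", "conclusion")):
    """Return {name: body} for each wanted section found in the text."""
    headings = list(HEADING.finditer(text))
    sections = {}
    for i, match in enumerate(headings):
        title = match.group(1).strip().lower()
        for name in wanted:
            if title.startswith(name) and name not in sections:
                # A section runs until the next heading, or to the end of the text.
                end = headings[i + 1].start() if i + 1 < len(headings) else len(text)
                body = text[match.end():end]
                sections[name] = " ".join(body.split())  # collapse whitespace
    return sections
```

Calling `extract_sections(clean_text(raw))` on the converted text of a paper returns a dictionary with whichever of the two sections were found. Real papers are messier than this pattern allows (multi-column layouts, headings in small caps, short unpunctuated lines that look like headings), so treat it as a starting point only.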
Here is a short description of the project. It should soon be hosted on easyskim.co.uk and we also plan on developing the engine further during our spare time, if any...

Pictured from left to right: Josh, Rebecca, me and Keller.

Here is a picture of the team. We also got into the top 10 projects at this hackathon! Let me say once again: it was a pleasure to work with you guys!
05 Sep 2015, 10:39