Friday, February 15, 2013

Code for File-type Identification

Of late I've received many requests to share, and queries about, the code used in our work 'Statistical Learning for file-type Identification' [pdf]. Unfortunately, most of the setup and experiments were run using hacky Python scripts that I no longer have access to. However, I was able to find one tool that uses 2-gram-based features from files to identify the file type. Here is the source code.
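
For readers unfamiliar with the idea, here is a rough sketch of what 2-gram based features can look like. It is not the tool's actual code: it assumes the 2-grams are overlapping pairs of bytes, and the function names `bigram_features` and `cosine_similarity` are made up for this illustration. Each file becomes a histogram of byte-pair frequencies, and two histograms can then be compared, or fed to whatever classifier you prefer.

```python
import math
from collections import Counter


def bigram_features(data: bytes) -> dict:
    """Return the relative frequency of each overlapping byte 2-gram in data.

    Assumption: a 2-gram is a pair of adjacent bytes. Keys are (int, int) tuples.
    """
    counts = Counter(zip(data, data[1:]))
    total = sum(counts.values())
    if total == 0:
        # Fewer than two bytes: there are no 2-grams to count.
        return {}
    return {gram: n / total for gram, n in counts.items()}


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse feature dicts (0.0 if either is empty)."""
    dot = sum(value * b.get(gram, 0.0) for gram, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

To try it on real files, read them in binary mode, e.g. `bigram_features(open(path, "rb").read())`, and compare the result against feature vectors built from files of known types. A plain similarity comparison like this is only a stand-in for the statistical learning methods discussed in the paper.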