We’ve been busy getting the new version of Xena ready for release, and it was finally made available yesterday. This version brings many changes, most notably a complete revamp of the licensing of the external libraries the software uses, along with lots of new features. It’s a major turning point for the software, which is now released under the GPLv3.
* Updated license to GPL version 3 (included in COPYING.txt).
* Ability to create raw text versions of document formats for indexing purposes.
* Integration with the Tesseract OCR software.
* Windows version released with automated installer.
* Normaliser for harvested websites.
* Guesser for ODF. As ODF is already an open format, these files are binary normalised only.
* Advanced Magic Guesser.
* ImageMagick Guesser using the external convert program.
* Support for audio files in OGG container format using Vorbis, FLAC or Speex codecs.
* Improved MP3 guesser.
* Support for more image formats.
* Major internal refactoring of the external libraries used.
* Libraries now updated and built from source.
* Using a new charset detection library.
* Ability to preserve directory structures.
* Ability to handle files normalised with previous versions of Xena.
* Automatic configuration of output and log directories.
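To give a feel for what a guesser for the OGG container support listed above has to do, here is a minimal sketch in Python. It is not Xena's actual code (Xena is written in Java, and its guesser API is not shown here); the function name `guess_ogg_codec` and its return values are made up for illustration. It relies only on the published container layout: an Ogg page starts with the bytes `OggS`, has a 27-byte fixed header followed by a segment table, and the first packet of the stream starts with a codec-specific signature.

```python
# Hypothetical sketch, not Xena's implementation: identify the codec
# carried in an Ogg stream by inspecting its first page.

OGG_CAPTURE = b"OggS"      # capture pattern at the start of every Ogg page
OGG_FIXED_HEADER_LEN = 27  # bytes before the segment table

# Signatures that open the first packet for each codec.
CODEC_SIGNATURES = (
    (b"\x01vorbis", "vorbis"),  # Vorbis identification header
    (b"\x7fFLAC", "flac"),      # FLAC-in-Ogg mapping header
    (b"Speex   ", "speex"),     # Speex header ("Speex" plus three spaces)
)


def guess_ogg_codec(data: bytes):
    """Return "vorbis", "flac" or "speex" for a recognised stream,
    "unknown" for an Ogg page with another payload, or None if the
    data does not start with an Ogg page at all."""
    if len(data) < OGG_FIXED_HEADER_LEN or data[:4] != OGG_CAPTURE:
        return None
    # Byte 26 holds the number of entries in the segment table,
    # and the packet data starts right after that table.
    segment_count = data[26]
    payload = data[OGG_FIXED_HEADER_LEN + segment_count:]
    for signature, name in CODEC_SIGNATURES:
        if payload.startswith(signature):
            return name
    return "unknown"
```

In practice only the first few dozen bytes of a file need to be read and passed to the function. A production guesser would also want to check the page checksum and cope with multiplexed streams, which this sketch ignores.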