Last update: 2017-03-15
I am consulting for the Centro Internazionale del Libro Parlato, Feltre, Italy (International Center for the Spoken Book, Feltre, Italy), which is generously sponsoring my work on two open source/free software projects:
During March and April 2017 I will rewrite the aeneas Web application from scratch, as an API server with a GUI on top of it. New features will be added, such as storing the state and I/O files of each task/job, and better CC auto-segmentation. The code will be published under an open source license, and my work will be supported by the Italian group of TED/TEDx Translators.
At the beginning of March 2017, I released aeneas v1.7.2. Except for bug fixes, I am not planning to work directly on it for the next few months. I will think about the next major version (v2.x) and the big changes that it will require.
I am currently building a Python library to automatically segment sentences into chunks for closed captioning, called lachesis after the Fate who measured the length of the thread of life. It is built on top of, and aims at abstracting, other NLP libraries (currently: UDPipe), and it is heavily tuned for CC applications.
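To give an idea of what "segmenting a sentence into CC chunks" means, here is a minimal sketch of the most naive baseline: greedily packing words into lines under a character limit. This is not the lachesis API or its algorithm; the function name `split_into_cc_lines` and the default limit of 42 characters are assumptions made for illustration only. A linguistically aware tool would use the NLP analysis (e.g., from UDPipe) to avoid breaking lines in awkward places, which this baseline ignores.

```python
def split_into_cc_lines(sentence, max_chars=42):
    """Greedily pack whitespace-separated words into lines of at most max_chars.

    Hypothetical helper for illustration; it looks only at line length,
    not at syntax. A single word longer than max_chars gets its own line.
    """
    lines = []
    current = ""
    for word in sentence.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars or not current:
            # The word still fits (or the line is empty): keep extending it.
            current = candidate
        else:
            # The word does not fit: close the current line, start a new one.
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    text = "It is built on top of other NLP libraries and it is heavily tuned for CC applications."
    for line in split_into_cc_lines(text):
        print(line)
```

The greedy strategy is simple and fast, but it can leave a very short last line or split right after an article or preposition, which is exactly the kind of problem a CC-oriented segmenter tries to avoid.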
To celebrate 2017, on 2017-01-01 I put online this guide to help non-tech-savvy Windows users install Python and run a Python program in the Command Prompt.
Professor Tullio De Mauro recently passed away. He was one of the most influential Italian linguists, compiling one of the most comprehensive dictionaries of modern Italian (with an accompanying abridged version), and more importantly, one of the few sentinels who dared to speak out about the decline in literacy experienced in Italy, mostly due to lack of funds and ignorance from politicians. Just in November 2016, Prof. De Mauro published a new version of the vocabulary of most common Italian terms (Nuovo vocabolario di base della lingua italiana, or NVdB for short), releasing the list of words as a PDF. As a thank-you for his work, and to remember it, I put on GitHub a Python script to extract the text data from the PDF file and clean it. The same repository also holds the processed/cleaned files, as UTF-8 encoded plain text.
Since I grew tired of having to use the cumbersome, JS/AJAX-ridden local Web interface of my WebCube4 (Huawei E8378) 4G/LTE router, I thought about automating the process with a shell script. So I captured some traffic with Wireshark and analyzed it, deciphering the relevant HTTP headers and API nodes, and wrapped them in a simple but elegant Bash script, now published as webcube4 on GitHub. (The process is a story worth telling on its own...)
A medium-term project I need to resume working on is yael, my Python library for reading/writing/modifying EPUB files. The reading part is essentially done, but the writing part is missing. I also need to rethink its architecture, since the current one is a bit disorganized and inefficient.
During the summer of 2016, I coded a Wikimedia parser that extracts pronunciations in International Phonetic Alphabet (IPA) format from Wiktionary pages. The goal is to train a grapheme-to-phoneme (aka letter-to-sound) model to improve current open source text-to-speech (TTS) engines, like eSpeak-ng. The project, named wiktts, unfortunately halted for lack of funding/time, but I would like to resume working on it. As a by-product, I would like to produce a better-sounding Italian voice for eSpeak-ng. A very cool further development would be a Web application that crowdsources the evaluation of synthesized words, using the contributed judgements to retrain and refine the TTS voice.
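As a rough illustration of the extraction step, the sketch below pulls IPA strings out of a chunk of wikitext. It is not the wiktts code: the function name `extract_ipa` is hypothetical, and the sketch assumes that pronunciations appear inside `{{IPA|...}}` templates as arguments delimited by slashes (phonemic) or square brackets (phonetic). Real Wiktionary markup varies across language editions and over time, so a real parser needs to handle many more cases.

```python
import re

# Assumption: pronunciations live in non-nested templates like {{IPA|it|/ˈka.za/}}.
IPA_TEMPLATE = re.compile(r"\{\{IPA\|([^{}]*)\}\}")


def extract_ipa(wikitext):
    """Return the IPA strings found in {{IPA|...}} templates, in order.

    Hypothetical helper for illustration: an argument counts as a
    pronunciation if it is wrapped in /.../ or [...].
    """
    pronunciations = []
    for match in IPA_TEMPLATE.finditer(wikitext):
        for arg in match.group(1).split("|"):
            arg = arg.strip()
            if len(arg) < 2:
                continue
            phonemic = arg[0] == "/" and arg[-1] == "/"
            phonetic = arg[0] == "[" and arg[-1] == "]"
            if phonemic or phonetic:
                pronunciations.append(arg)
    return pronunciations


if __name__ == "__main__":
    sample = "===Pronunciation===\n* {{IPA|it|/ˈka.za/}}\n"
    print(extract_ipa(sample))
```

Pairing each extracted string with the page title (the written word) would give the (grapheme, phoneme) pairs needed to train a letter-to-sound model.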
My tiny but very popular project glyphIgo also needs an update. During the Christmas holidays I need to find time for it.
Finally, a few months ago I built a cadence meter for my spin bike with a Hall sensor attached to an Arduino board. The sensor sends the current data (RPM, run time, etc.) to a PC via the USB cable, and some Python code reads the data, stores it, and shows it as a dynamically updated dashboard in a browser. I also have a speech-recognition module that can be used to control the system via spoken commands. For example, I can change the parameters displayed on the dashboard or control the media player software (I love listening to podcasts/audiobooks while spinning, but I hate sweating on the remote control)... The project consists of a very heterogeneous stack (C code for Arduino, Python + ZMQ + Flask + PocketSphinx on PC), which is both interesting and challenging to set up correctly. I hope to find time to clean the code, write a good setup guide, and release it on GitHub.
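The core arithmetic of a cadence meter is simple enough to sketch. The snippet below is not the project's code (which has not been released); it assumes that the Hall sensor fires exactly once per crank revolution and that pulse timestamps are available in seconds. The function name `cadence_rpm` and the 5-second averaging window are hypothetical choices for illustration.

```python
def cadence_rpm(pulse_times, window=5.0):
    """Average cadence (RPM) over the last `window` seconds of pulses.

    Assumptions: `pulse_times` is a sorted list of timestamps in seconds,
    with one pulse per crank revolution. Returns 0.0 if there are not
    enough pulses to measure an interval.
    """
    if len(pulse_times) < 2:
        return 0.0
    latest = pulse_times[-1]
    # Keep only the pulses that fall inside the averaging window.
    recent = [t for t in pulse_times if latest - t <= window]
    if len(recent) < 2:
        return 0.0
    elapsed = recent[-1] - recent[0]
    if elapsed <= 0:
        return 0.0
    # N pulses delimit N - 1 full revolutions.
    return 60.0 * (len(recent) - 1) / elapsed


if __name__ == "__main__":
    # One pulse every 0.75 s; the result should be close to 80 RPM.
    times = [i * 0.75 for i in range(10)]
    print(round(cadence_rpm(times), 1))
```

Averaging over a short window smooths out the jitter of individual pulse intervals, at the cost of a dashboard that reacts a little more slowly to changes in pedaling speed.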
Mostly neural network stuff applied to speech recognition, and things called triangular global alignment kernels.
I am playing a bit with Python/NumPy out-of-core (on-disk) computation libraries, as I would like to add that capability to v2.x of aeneas.