Ghana Natural Language Processing (NLP)

It has recently become clear that language technologies are leaving certain languages behind, including most African languages. While Google Translate offers translation services in several African languages, no Ghanaian languages are included, and even for the included languages the quality has generally been found to be quite poor. The major challenges boil down to a lack of good (indeed, often any) training data, as well as a lack of adoption of major language technologies for local problems.

The problem has been receiving much attention recently, yielding a dedicated workshop at the upcoming ICLR in summer of 2020 in Addis Ababa. In particular, techniques for Ethiopian and other East African languages, Southern African languages and some Nigerian languages have been explored in some depth. To the best of our knowledge, no work has been done on Ghanaian languages. Thus, we kicked off Ghana NLP to begin closing this gap.

Ghana NLP is an open source initiative focused on natural language processing (NLP) of Ghanaian languages and its applications to local problems. While we will begin with Ghanaian languages, likely focusing on Twi due to its number of speakers, our ultimate goal is for our tools to be applicable throughout the West African subregion and beyond.

Broad Agenda

  • Develop better data sources to train state-of-the-art (SOTA) NLP techniques for Ghanaian languages
  • Contribute to adapting SOTA techniques to work better in a lower-resource setting such as this one
  • Build functional systems for local applications, e.g., a “Google Translate” for Twi, Ewe, Frafra, etc.

Sponsors and Partners