
Computational Tools for the South African Languages Spoken Language Corpus

(VR - Swedish Research Links)

Jens Allwood, University of Gothenburg
Laurette Pretorius, University of South Africa, Pretoria

Spoken language corpora for the official languages of Southern Africa are currently under development at several institutions in South Africa, including universities and Language Research and Development Units. This work follows from a collaborative research project on the Development of Spoken Language Corpora (2005-2007) between Göteborg University (GU) and the University of South Africa (Unisa). We have now reached the stage at which the data in these corpora should be computationally explored and processed for research and language technology purposes. Our objective for the planning year is to develop a research plan for the computational manipulation and processing of the Southern African spoken language corpora with a variety of applications in mind: spoken language grammars and dictionaries, corpus planning for language development, and the development of learner material.

The planning process in 2008 will involve the following:

  1. To do a needs analysis
  2. To do a survey of the available (commercial and shareware) corpus packages
  3. To evaluate the suitability of the available software
  4. To specify and document the adaptation and modification required by unique mark-up features of the available software
  5. To identify special software needs of Southern African spoken language corpora that readily available software does not meet

In order to do the needs analysis, two workshops (one in Göteborg and one in South Africa) will be held in the course of 2008.

Page Manager: Pavel Rodin|Last update: 1/28/2011