Translation
Natural Language Processing raises new scientific and engineering challenges for each new language. An excellent way to improve the quality of NLP for a given language in the longer term is for more speakers of that language to become active members of the NLP community. You could help this process by forming a group to translate the NLTK book into your language.
Ongoing Translation Work
Work is ongoing in the following languages; write to the contact address (mailing list or individual) if you are interested in participating.
| Language | Materials | Contact | Leaders | Corpora |
|---|---|---|---|---|
| Greek | Gr:Book | Steven Bird | Theodosios Chimonidis, Evangelos Himonides | none |
| Hindi | Hi:Book | mailing list | Grishma Govani | none |
| Portuguese | Pt:Book, Guide | mailing list | Lucia Specia, Tiago Tresoldi | tagged text, treebank, no lexicon |
| Spanish | Es:Book | mailing list | Antoni Oliver, Maria Dolores Rodríguez | none |
| Tamil | Ta:Book | mailing list | Sri Ramadoss M | none |
Obtaining Data
An initial step, before translating the book, is to obtain linguistically annotated data such as tagged text, a treebank, or a lexicon. Please try to get permission for this data (or a sample) to be included with NLTK's data distribution. Consider also writing a guide to basic NLP tasks in your language (cf. http://nltk.org/doc/guides/portuguese.html).
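
When evaluating a sample of tagged text, it can help to load it and check which tags it contains. The following is a minimal sketch only, assuming the sample uses the common `word/TAG` convention with whitespace-separated tokens; the function name `parse_tagged`, the sample sentence, and its tag names are hypothetical and not taken from any particular corpus.

```python
from collections import Counter
from typing import List, Optional, Tuple


def parse_tagged(line: str, sep: str = "/") -> List[Tuple[str, Optional[str]]]:
    """Split a line of 'word/TAG' tokens into (word, tag) pairs.

    Assumption: tokens are separated by whitespace and the tag follows
    the LAST occurrence of `sep`. Tokens without a separator get tag None.
    """
    pairs = []
    for token in line.split():
        if sep in token:
            word, _, tag = token.rpartition(sep)
            pairs.append((word, tag))
        else:
            pairs.append((token, None))
    return pairs


# Hypothetical sample line; the tag names are illustrative only.
sample = "O/ART menino/N corre/V ./PUNCT"
pairs = parse_tagged(sample)

# Count how often each tag occurs, as a quick sanity check of the sample.
tag_counts = Counter(tag for _, tag in pairs)
```

If the data uses a different layout (for example one token per line, or a different separator), the splitting logic would need to change accordingly.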
Translating the Book
- Contact Steven Bird to indicate your interest, and to obtain a wiki account
- Ideally there will be multiple people sharing the task, and a new mailing list will be set up to facilitate communication
- Find out whether there is an automatic translation service from English to your language (e.g. Babelfish supports about a dozen languages)
- Create a wiki page Lg:Book where Lg is the ISO 639 code for the language; use this to link to the chapters being translated and to keep a public record of progress
- Create a wiki page Lg:Terminology to hold a table of terminology translations, for consistency across the book; discuss terminology issues on the mailing list (cf. our term index)
- Obtain the English source for the chapter you wish to translate by replacing the filename suffix in the chapter's URL with .txt, e.g. http://nltk.org/doc/en/tag.txt
- Create a wiki page Lg:Chapter, where Chapter is the chapter name
- Convert the section headings and program listings into wiki text, using == for chapter headings, === for section headings, and <pre>...</pre> for program listings.
- Rework some of the English examples with equivalent examples using available corpora for your language
- Write an appendix, focussing on any issues specific to your language that are not covered elsewhere in the book.
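
The mechanical parts of the steps above (deriving the source URL, naming the wiki pages, and wrapping headings and program listings in wiki text) can be sketched in a few lines of Python. This is an illustrative sketch, not an official tool: the function names are hypothetical, the `.html` suffix in the usage example is an assumption about the chapter's published URL, and the capitalised language prefix simply mirrors page names such as Pt:Book in the table above.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def source_url(chapter_url: str) -> str:
    """Replace the filename suffix of a chapter URL with .txt."""
    parts = urlsplit(chapter_url)
    stem, _suffix = posixpath.splitext(parts.path)
    return urlunsplit(parts._replace(path=stem + ".txt"))


def wiki_page(lang_code: str, name: str) -> str:
    """Build a page title of the form Lg:Name, e.g. Pt:Book."""
    return f"{lang_code.capitalize()}:{name}"


def chapter_heading(title: str) -> str:
    """Chapter headings use == ... == in wiki text."""
    return f"== {title} =="


def section_heading(title: str) -> str:
    """Section headings use === ... === in wiki text."""
    return f"=== {title} ==="


def program_listing(code: str) -> str:
    """Wrap a program listing in <pre>...</pre>."""
    return "<pre>\n" + code.rstrip("\n") + "\n</pre>"


# Usage: the .html chapter URL below is an assumed example.
txt_url = source_url("http://nltk.org/doc/en/tag.html")
book_page = wiki_page("pt", "Book")
terminology_page = wiki_page("pt", "Terminology")
```

The helpers only format strings; fetching the source text and posting to the wiki are left out, since they depend on site details not described here.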



