Natural Language Toolkit

...software, data sets and tutorials for natural language processing...

Developers Guide

 

This page contains information and guidelines for NLTK developers. We are collaborating on various development tasks, and welcome like-minded Python programmers who want to help out.

You can easily download the source code and contribute code. It often helps to have your own SourceForge account.

Project members

A smaller number of people are involved to the point where it makes sense for them to have commit access to the repository. Their SourceForge accounts show up in the list of NLTK project members. These people have demonstrated appropriate technical ability and made an extended commitment to the project, typically to add major functionality or to maintain a module over time.

The following expectations apply to project members:

  • maintain high-quality coding standards; follow PEP 8 style for code; use Epydoc docstrings; update /nltk/test/*.doctest whenever code is changed
  • follow the guidelines on Package Structure
  • include a descriptive log message with every commit
  • ensure changes don't break other code (or update other code as needed); create a branch for experimental changes
  • provide a demo() function in each module to illustrate its functionality; demo code should include its own import statements so that it can run standalone; the demo function should take no arguments and should run when the module is executed from the command line
  • subscribe to nltk-devel and nltk-commits mailing lists
  • discuss and announce significant changes on nltk-devel (including names for new packages)
  • before modifying code contributed by another developer, seek that person's permission (seek administrator permission if the person does not respond)
  • only commit to /nltk_contrib, except where agreed with project administrators
  • don't commit third-party documents to the repository when readers can be directed to these simply by providing a URL
  • have each package/__init__.py import all sibling modules, including api.py and util.py; have every module import directly from other sibling modules as needed (and not via a global import such as "from nltk.package import X")
  • follow our guidelines on incorporating External Packages
  • use LF (Unix) line termination (not DOS CRLF)
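
As an illustration of the demo() and docstring expectations above, here is a minimal sketch of a module laid out that way. The module contents, the function name word_frequencies, and the sample sentence are hypothetical and are not part of NLTK; the docstring fields use Epydoc's epytext markup.

```python
"""Hypothetical example module: counts token frequencies (not part of NLTK)."""


def word_frequencies(tokens):
    """
    Count how often each token occurs.

    @param tokens: the tokens to count
    @type tokens: C{list} of C{str}
    @return: a mapping from each token to its number of occurrences
    @rtype: C{dict}
    """
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    return counts


def demo():
    """Illustrate this module's functionality; takes no arguments."""
    # The demo carries its own imports so the body can be copied and run standalone.
    from pprint import pprint

    tokens = "the cat sat on the mat".split()
    pprint(word_frequencies(tokens))


# Run the demo when the module is executed from the command line.
if __name__ == "__main__":
    demo()
```

Saving this as a file and running it with the Python interpreter calls demo(); importing it from another module does not.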

The following expectations apply to project administrators:

  • ensure the core nltk library (excluding contrib) always works (passes the doctests and the demo tests)
  • update the book to reflect changes to the library
  • invite new developers once agreed with other administrators
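
One way to check that a doctest file still passes is the standard library's doctest.testfile function. The sketch below is an assumption about how such a check could be scripted, not a description of NLTK's own test runner; the file name example.doctest, its contents, and the helper run_doctest_text are hypothetical.

```python
import doctest
import os
import tempfile

# Hypothetical contents of a .doctest file: prose mixed with interpreter examples.
DOCTEST_TEXT = """
Splitting a string on whitespace yields a list of tokens:

    >>> "the cat sat".split()
    ['the', 'cat', 'sat']
"""


def run_doctest_text(text):
    """Write text to a temporary .doctest file, run it, and return (failed, attempted)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "example.doctest")
        with open(path, "w") as handle:
            handle.write(text)
        # module_relative=False makes doctest treat path as an ordinary file path.
        results = doctest.testfile(path, module_relative=False)
    return results.failed, results.attempted


if __name__ == "__main__":
    failed, attempted = run_doctest_text(DOCTEST_TEXT)
    print("%d of %d examples failed" % (failed, attempted))
```

For a file that already exists in a checkout, the same doctest.testfile call can be pointed at its path directly.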
