FAQ
Do you have a question that is not answered here? Please send it to Steven Bird.
What license does NLTK use?
NLTK is open source software. The source code is distributed under the terms of the GNU General Public License. The documentation is distributed under the terms of the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States license. The corpora are distributed under various licenses, as documented in their respective README files.
What are the plans for further development of NLTK?
In 2007 there will be releases every 1-2 months, according to the development schedule. The toolkit will continue to be supported in 2008 and beyond with bugfixes and additional functionality, at a rate depending on the level of community support.
What is the difference between NLTK and NLTK-Lite?
Since mid-2005, the NLTK developers have been creating a lightweight version of NLTK, called NLTK-Lite. NLTK-Lite is simpler and faster than NLTK. The goal was for NLTK-Lite to provide the same functionality as NLTK once complete; in fact, all of NLTK's functionality is now in NLTK-Lite 0.9, and the package is called nltk. Unlike the old NLTK, NLTK-Lite does not impose a heavy burden on the programmer. Wherever possible, standard Python objects are used instead of custom NLP versions, so that students learning to program for the first time will be learning to program in Python with some useful libraries, rather than learning to program in NLTK. Once it reaches version 1.0 (in late 2007), NLTK-Lite will take over the original NLTK name and become NLTK 2.0.
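To illustrate what "standard Python objects" can mean in practice, here is a minimal sketch that uses only built-in lists, tuples and dictionaries. It does not call NLTK at all; the sentence, the tags and the variable names are made up for illustration and are not taken from the toolkit.

```python
# A sentence held as a plain list of strings rather than a custom token class.
sentence = "the cat sat on the mat".split()

# A tagged sentence held as a list of (word, tag) tuples.
# The tags here are illustrative assumptions, not the output of any tagger.
tagged = [("the", "DT"), ("cat", "NN"), ("sat", "VBD")]

# Ordinary Python idioms then work directly on the data.
nouns = [word for (word, tag) in tagged if tag == "NN"]

# Word frequencies kept in an ordinary dictionary.
counts = {}
for word in sentence:
    counts[word] = counts.get(word, 0) + 1
```

Because the data are ordinary lists and dictionaries, anything learned here carries over to Python programming in general.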
I'm planning some long-term research using NLTK; how long is the toolkit going to be supported?
We plan to continue supporting the toolkit for as long as possible. We're publishing a book on NLTK in 2007 and plan to support the toolkit for several years while the book is in active use, and while the developers are employed to teach natural language processing. In particular, bug reports will be attended to as quickly as possible.
I think I found a bug; where do I report it?
Please report any bugs to the NLTK Bug Tracker, giving as much detail as possible. Please include a code sample that permits us to replicate the problem.
Why is Python giving me a syntax error when I use NLTK?
NLTK requires Python version 2.4 or later. If you use an earlier version of Python you will see lots of syntax errors.
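If you are unsure which Python version you are running, a quick check along the following lines may help. This is only a sketch: the function name python_is_supported and the constant MINIMUM_VERSION are made up for this example, and the (2, 4) threshold is simply the requirement stated above.

```python
import sys

# Minimum version taken from the answer above; adjust if the requirement changes.
MINIMUM_VERSION = (2, 4)

def python_is_supported(version_info=None, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major and minor components.
    return tuple(version_info[:2]) >= tuple(minimum)

if __name__ == "__main__":
    if python_is_supported():
        print("This Python version meets the stated minimum.")
    else:
        print("This Python version is older than the stated minimum.")
```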
How can I install NLTK from the source code repository?
Most users should install NLTK from a distribution. Please see the installation instructions. However, if you need an up-to-the-minute version, then you will have to install NLTK from the source repository. You'll need to have a Subversion client installed on your machine, then check out the main branch from https://svn.sourceforge.net/svnroot/nltk/trunk/nltk. For more details, please see the NLTK Subversion Instructions. Once you've downloaded this, you'll need to run the top level setup.py program to install this version of NLTK on your machine.
How can I find out where NLTK is installed on my system?
Run the following in a Python interpreter session. In this example, NLTK is installed in /usr/lib/python2.5/site-packages/nltk:
>>> import nltk
>>> print nltk
<module 'nltk' from '/usr/lib/python2.5/site-packages/nltk/__init__.pyc'>
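The directory can also be obtained programmatically from a module's __file__ attribute. The sketch below assumes the module was loaded from a file on disk; the helper name package_dir is hypothetical, and the standard library's json package stands in for nltk so that the example runs even where NLTK is not installed.

```python
import os
import json  # stand-in package; pass the nltk module instead if it is installed

def package_dir(module):
    """Return the directory containing the file a module was loaded from."""
    return os.path.dirname(os.path.abspath(module.__file__))

if __name__ == "__main__":
    print(package_dir(json))
```

With NLTK installed, calling package_dir(nltk) after import nltk should give the same directory that appears in the interpreter output above.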
What papers have been published about NLTK?
NLTK has been presented at several academic conferences, and reviewed in online forums. Please see the Documentation page for more information.
How is NLTK development supported?
NLTK is an open source project that depends mainly on the efforts of volunteers. Occasionally we have funds for a summer intern or TA to work on specified projects. Students and teachers also donate code. We strongly encourage volunteers to get involved: find out more about contributing to NLTK. If you find the toolkit useful, please make a donation to support further development.
How did NLTK start?
The NLTK project began when Steven Bird was teaching CIS-530 at the University of Pennsylvania in 2001, and hired his star student from the previous offering of the course, Edward Loper, to be the teaching assistant (TA). They agreed on a plan for developing software infrastructure for NLP teaching that could be easily maintained over time. Edward wrote up the plan, and both began work on it right away. Here is the Version 0.2 release announcement that appeared in September 2001.
If I just "use" NLTK using import statements in Python, am I obliged to publish my source code as well?
No, there is no such obligation. Please see section 2, "Basic Permissions", in the GNU General Public License.
What is Natural Language Processing?
Please see http://en.wikipedia.org/wiki/Natural_language_processing
Which IT companies are involved in Natural Language Processing?
Please see our page of links for NLP in the IT industry.



