class Lexicon
Lexicon
class supports efficient lookup operations for words and prefixes.
As an example of the use of the Lexicon
class, the
following program lists all the two-letter words in the lexicon
stored in EnglishWords.txt
:
int main() { Lexicon english("EnglishWords.txt"); for (string word : english) { if (word.length() == 2) { cout << word << endl; } } return 0; }
This version of Lexicon
is implemented internally using a data structure called a prefix tree or trie (pronounced "try"), which allows for efficient prefix and word searching.
Some runtimes below are expressed in terms of "W" where W represents the length of the word being added.
Constructor | ||
O(1) | Initializes a new empty lexicon. | |
Lexicon(istream) |
O(WN) | Initializes a new lexicon that reads words from the given file or input stream. |
Methods | ||
O(W) | Adds the specified word to the lexicon. | |
addWordsFromFile(istream) |
O(WN) | Reads the given file/stream and adds all of its words to the lexicon. |
O(N) | Removes all words from the lexicon. | |
O(W) | Returns true if word is contained in the lexicon. |
|
O(W) | Returns true if any words in the lexicon begin with prefix . |
|
O(WN) | Returns true if the two lexicons contain the same words. |
|
O(1) | Returns true if the lexicon contains no words. |
|
O(N) | Calls the specified function on each word in the lexicon. | |
O(W) | Removes the specified word from the lexicon. | |
O(W) | Removes all words that begin with the specified prefix from the lexicon. | |
O(1) | Returns the number of words contained in the lexicon. | |
O(1) | Converts the lexicon to a printable string representation. | |
Operators | ||
O(WN) | Returns true if lex1 and lex2 contain the same elements. |
|
O(WN) | Returns true if lex1 and lex2 are different. |
|
O(N) | Outputs the contents of the lexicon to the given output stream. | |
O(WN) | Reads the contents of the given input stream into the lexicon. |
Lexicon(); Lexicon(istream& input); Lexicon(string filename);
EnglishWords.dat
containing a list of words in English. The standard code pattern
to initialize that lexicon looks like this:
Lexicon english("EnglishWords.dat");
Usage:
Lexicon lex; Lexicon lex(filename);
void add(string word);
Usage:
lex.add(word);
void addWordsFromFile(istream& input); void addWordsFromFile(string filename);
Usage:
lex.addWordsFromFile(filename);
void clear();
Usage:
lex.clear();
bool contains(string word) const;
true
if word
is contained in the
lexicon. In the Lexicon
class, the case of letters is
ignored, so "Zoo" is the same as "ZOO" or "zoo".
Usage:
if (lex.contains(word)) ...
bool containsPrefix(string prefix) const;
prefix
.
Like containsWord
, this method ignores the case of letters
so that "MO" is a prefix of "monkey" or "Monday".
Usage:
if (lex.containsPrefix(prefix)) ...
bool equals(const Lexicon& lex) const;
true
if the two lexicons contain exactly the same set of words.
Identical in behavior to the ==
operator.
Usage:
if (lex.equals(lex2)) ...
bool isEmpty() const;
true
if the lexicon contains no words.
Usage:
if (lex.isEmpty()) ...
void mapAll(void (*fn)(string)) const; void mapAll(void (*fn)(const string&)) const; void mapAll(FunctorType fn) const;
Usage:
lexicon.mapAll(fn);
bool remove(string word);
true
if the word was previously contained in the lexicon;
in other words, if a word was removed.
The empty string cannot be contained in a lexicon, so passing the
empty string to this method returns false
.
Usage:
lex.remove(word);
bool removePrefix(string prefix);
true
if the prefix was previously contained in the lexicon; in other words, if any words were removed.
If the empty string is passed, since all words begin with the empty string, all words will be removed and this method will return true
if the lexicon was non-empty prior to the call.
Usage:
lex.removePrefix(prefix);
int size() const;
Usage:
int n = lex.size();
string toString() const;
"{word1, word2, word3}"
.
Usage:
string str = lexicon.toString();