Softnet Systems, Inc. Speech Recognition Specialists

Hints

Phrase Lists

Building Phrase Lists

Phrase lists can be obtained from many different sources. Glossaries and online references such as Wikipedia are good places to start.

These lists tend to have words and phrases that are regional/geographic and other words that are dependent on your profession. A good list also needs pairs or triplets of words where NaturallySpeaking may not be as accurate as it could be. For instance, the phrases "left ear" and "right ear" are sometimes confused (even in the Medical edition) with "left year" and "right year". (In fairness, "year" is an extremely common word in medical dictation, as doctors routinely dictate your age -- even if you lie about it.)

In the metropolitan Phoenix area, a phrase list would include items such as:

Camelback
Pinnacle (this word is almost always capitalized in Phoenix, but is normally lower-case in English as a whole)
Indian School Rd.
America West Airlines

In Arizona, the acronym "AHCCCS", a health care (Medicaid) plan, is pronounced "access" by 99% of all medical personnel. It doesn't make sense to add this for a North Dakota health care provider, who will not use the term, but in Arizona the following is added to medical vocabularies:

AHCCCS\access

Or, to keep "access" available as an ordinary word, some will add this as:

AHCCCS\access insurance

As an example of a profession-dependent phrase, take "to buy for", which might be a normal phrase for a corporate purchasing agent.

However, a construction engineer/architect/carpenter probably means:

2" x 4"

if the same sounds are made. So, for those of you in construction, you might have a list such as:

2" x 4"\two by four
2" x 6"\two by six
12" x 12"\twelve by twelve
Lamella roof
load-bearing
jalousie window
punch list
...
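
Typing many dimension entries by hand is tedious, so a short script can generate them. The following is a minimal Python sketch, not part of NaturallySpeaking; the helper name `dimension_entry` and the `NUMBER_WORDS` table are hypothetical and cover only the sizes shown.

```python
# Hypothetical lookup table: extend it with whatever sizes you dictate.
NUMBER_WORDS = {2: "two", 4: "four", 6: "six", 8: "eight", 10: "ten", 12: "twelve"}

def dimension_entry(width, depth):
    """Build one 'written\\spoken' line for a lumber dimension."""
    written = f'{width}" x {depth}"'
    spoken = f"{NUMBER_WORDS[width]} by {NUMBER_WORDS[depth]}"
    return f"{written}\\{spoken}"

# Generate the three dimension entries from the list above.
for width, depth in [(2, 4), (2, 6), (12, 12)]:
    print(dimension_entry(width, depth))
```

Each printed line can be pasted into a phrase list as is.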

You can also use this technique to get the correct capitalization of words. Adding a list such as:

FINDINGS:\findings :
PROCEDURE:\procedure :
SUBJECTIVE:\subjective :

lets you avoid saying "all-caps-on findings colon all-caps-off" and just say "findings colon" instead. (Gastroenterologists DO have a problem with the confusion of the word "colon" and the punctuation symbol ":". We have solutions for this!)
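
If you have a long list of report headings, the capitalized entries can be generated rather than typed. This is a small Python sketch under the assumption that every heading follows the same pattern as the three above; `heading_entry` is a hypothetical helper name.

```python
def heading_entry(word):
    """Build 'WRITTEN:\\spoken :' so the heading is typed in all caps."""
    return f"{word.upper()}:\\{word.lower()} :"

# Generate entries for the headings shown above.
for heading in ["findings", "procedure", "subjective"]:
    print(heading_entry(heading))
```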

If you have substantial text to analyze, there are tools to analyze it. Instant Text is available from Textware Solutions, www.textware.com. This is a tool aimed at professional transcriptionists, and you will only need a fraction of its capabilities. It is most useful if you can identify at least 1 MB of text files (.txt format) to analyze. The Instant Text program is ideal for this: it costs about $200, is small, and produces a list of your most frequently used phrases. This list can be edited using WordPerfect or Word into the format needed for NaturallySpeaking, then introduced into NaturallySpeaking in a couple of minutes. (Alternatively, if you supply text, Softnet Systems can run this analysis and return a phrase list to you; the cost is $99 for analyzing a single set of files.)

Instant Text won't pick out quite the same things as the manual approach, and you may need to delete some of the phrases it finds. But it is an easy, quick way to identify phrases where a large body of text is readily available.
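
For a rough idea of what a frequency analysis looks like, here is a minimal Python sketch that counts the most common word pairs in a body of text. It is only an illustration of the general idea, not how Instant Text works internally; the function name `frequent_phrases` is hypothetical, and the sketch ignores sentence boundaries and capitalization.

```python
import re
from collections import Counter

def frequent_phrases(text, n=2, top=10):
    """Return the `top` most common n-word phrases with their counts."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Slide a window of n words across the text.
    grams = zip(*(words[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in grams)
    return counts.most_common(top)

sample = "left ear clear. right ear clear. left ear canal normal."
for phrase, count in frequent_phrases(sample, n=2, top=3):
    print(count, phrase)
```

As with any automated list, expect to delete phrases that are ordinary English and keep only the ones that are unique to your work.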

Historically (i.e., prior to about Release 8), some of the most successful users of NaturallySpeaking had lists of hundreds or thousands of phrases. For most individuals, a few hundred phrases were appropriate. Almost all of the experience with these phrase lists is anecdotal, and more recent versions of Dragon need phrase lists only for unique phrases, not for general English phrases.

Build up a list of phrases that you find useful. Share them with colleagues and friends if appropriate. In a large organization, it will prove useful to have a set of phrases that relate to the organization, such as the names of groups, key names, common addresses, and frequently used acronyms that may have little meaning outside of the organization.

These lists can be included in custom vocabularies -- contact us for more details on these, which provide even higher accuracy.

Exporting/Maintaining Phrase Lists

Maintaining your list(s) of phrases isn't fun but it pays off. A key to high accuracy is to have a file with YOUR unique words and PHRASES. A few people have thousands of words in their phrase lists! These lists give Dragon more clues about what is important to YOU. For instance, if you deal with Tomm Smythe and have that name in your phrase list, you won't have to correct Dragon nearly so often, because the phrase "Tomm Smythe" tells Dragon that, for you, these words go together.

Other people with a similar vocabulary may get good use out of phrase lists. If you send such lists from your profession, I'll post them on this web site. I've gotten requests from geologists, botanists, tree surgeons, car restorers, auto parts dealers, as well as almost every major area of medicine and law.

Some people treasure these lists so much that copies go in their vaults for safe-keeping. We advise regular backups of your speech files, which include these lists.

At the beginning of the Millennium (1/1/2000) Joel Gould posted two utilities, "GetWords" and "PutWords", for the world to use freely. These make it easier to maintain these lists. You have three options: use these utilities; beginning with Release 6, export and import your custom words from the Dragon menus; or maintain the list manually (not recommended except in certain large organizations).

GetWords/PutWords to Maintain a Phrase List

This is primarily needed for Releases 1-5. However, these utilities can retain pronunciations, which the built-in export/import of word lists cannot, so the technique still has validity on Releases 6 and 7.

  1. If you don't already have GetWords on your system, go to the GetWords/PutWords page on Joel Gould's web site. Download both utilities and install them on your system. Both utilities are very easy to use, but if you need help, Joel Gould explains how to use them right on the download page.
  2. Start NaturallySpeaking and open the user files with the custom words and phrases you would like to put into a list. Run GetWords. If you have trained any of your custom words or phrases, you should have the "Include pronunciations in the word list" box checked. An easy way to organize your phrase lists is to name them with the date followed by the user name. For example:

    2000-7-12 user 1.txt

    Notice that the year comes first in the date. This is done so that if you have lists from multiple years the Windows Explorer lists them chronologically.

  3. Open the newly created text file and scan your phrase list. To delete a word that is no longer useful or was created by mistake, erase the entire line containing it. When you are done editing your phrase list, save it.
  4. Occasionally someone will have multiple user files with different custom words or phrases. To maintain one consolidated phrase list, do the following: use GetWords to create a custom word list from one user, then use PutWords to merge that word list into the second user's files (you must open the second user's files in NaturallySpeaking to do this). Finally, use GetWords on the second set of user files to gather one complete list. Edit it as necessary and save it.
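
If you already have two exported lists as plain text, the consolidation in step 4 can also be approximated outside of Dragon. The Python sketch below is an illustration only: it does not call GetWords or PutWords, it assumes simple lists with one entry per line, and the names `merge_phrase_lists` and `dated_filename` are hypothetical.

```python
from datetime import date

def merge_phrase_lists(*lists):
    """Combine lists, keeping the first occurrence of each entry."""
    seen = set()
    merged = []
    for lines in lists:
        for line in lines:
            entry = line.rstrip()  # drop trailing blanks and newlines
            if entry and entry not in seen:
                seen.add(entry)
                merged.append(entry)
    return merged

def dated_filename(user, when):
    """Year-first name with zero-padded month and day, e.g. '2000-07-12 user 1.txt'."""
    return f"{when.isoformat()} {user}.txt"

user1 = ["Tomm Smythe", "AHCCCS\\access", "punch list"]
user2 = ["punch list", "Lamella roof", "Tomm Smythe  "]
print(merge_phrase_lists(user1, user2))
print(dated_filename("user 1", date(2000, 7, 12)))
```

Zero-padding the month and day means the file names also fall in chronological order under a plain alphabetical sort.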

Maintaining a Phrase List Manually

Make this a TEXT (.txt) file so that you can import it easily to a new user.

Save the file off-line so that if your system crashes you have a copy.

The format of the file is simple -- one word or phrase per line.

If it is just a new word or phrase, then all that goes on the line is the new word/phrase.

If there is a different spoken form than written form, then the line should contain:

<written form>\spoken form

That is, the spoken form follows the written form, separated by a "\" character. Do not leave trailing blanks, as they may confuse Vocabulary Builder.
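
The format is simple enough to check with a few lines of code before importing a file. This Python sketch, with the hypothetical helpers `parse_entry` and `clean_lines`, splits each line into its written and spoken forms and strips trailing blanks; it is an assumption-based illustration and not a tool supplied with NaturallySpeaking.

```python
def parse_entry(line):
    """Return (written, spoken); spoken is None when the line has no '\\'."""
    entry = line.rstrip()
    written, separator, spoken = entry.partition("\\")
    return (written, spoken if separator else None)

def clean_lines(lines):
    """Drop empty lines and remove trailing blanks from the rest."""
    return [line.rstrip() for line in lines if line.strip()]

raw = ["AHCCCS\\access insurance  \n", "\n", "punch list\n"]
for line in clean_lines(raw):
    print(parse_entry(line))
```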

Some prefer to keep related items in separate files. This is OK. You can add these files one at a time using Vocabulary Builder. Just remember to select the box to not rebuild the language model, and don't "Add" these files on the Vocabulary Builder.

Some keep their word lists in alphabetic order -- this is nice and neat but NaturallySpeaking doesn't care.

With the Professional/Legal/Medical Editions of NaturallySpeaking, one can have a command to add phrases to this file as you encounter them. Watch for this as a coming attraction!

With Release 7, there is a command "Add Phrase to Vocabulary" to add phrases to DNS as you encounter them. We suggest you use that command often as you encounter repetitive and unusual text.

Multiple User/Enterprise Phrase Lists

In an enterprise (more than one user in an organization) there tend to be common words, phrases, and acronyms that can be shared between all users of any version of NaturallySpeaking. These include names of people within the organization, key customers and suppliers, titles, product names, addresses, phone numbers/prefixes, marketing slogans, etc. Much of this information can be exported from Personal Information Managers such as MS-Outlook, ACT!, GoldMine, etc.
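
As an illustration of turning a contact export into phrases, here is a minimal Python sketch that reads a CSV export and produces one phrase per line. The column names "First Name", "Last Name", and "Company" are assumptions; check the header row of your own export and adjust them. The function name `phrases_from_contacts` is hypothetical.

```python
import csv
import io

def phrases_from_contacts(csv_text):
    """Collect full names and company names, one phrase each, without duplicates."""
    phrases = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        first = (row.get("First Name") or "").strip()
        last = (row.get("Last Name") or "").strip()
        company = (row.get("Company") or "").strip()
        full_name = " ".join(part for part in (first, last) if part)
        for phrase in (full_name, company):
            if phrase and phrase not in phrases:
                phrases.append(phrase)
    return phrases

sample = (
    "First Name,Last Name,Company\n"
    "Tomm,Smythe,America West Airlines\n"
    "Pat,Jones,America West Airlines\n"
)
print("\n".join(phrases_from_contacts(sample)))
```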

Some organizations have glossaries of terms used within the business. A word list with these items, one "phrase" per line, can help all users of NaturallySpeaking obtain more accurate results with less effort. The list can be imported into each topic or vocabulary for each user using the Vocabulary Builder. This can help significantly with unusual names, unusual capitalizations, and common industry terms.

Alternatively, starting with the Release 7 Professional Series products, there is a utility, nsapps, for distributing these word lists. By Release 9 the "Data Distribution" function was added for Professional, Legal, and Medical users. It is strictly an installation/maintenance feature for distribution. The key is to build and use word/phrase lists to increase accuracy and speed.

In some organizations, multiple word/phrase lists covering different aspects of the enterprise will be necessary.
