Free Terminology Applications
WEB APPLICATIONS
Term extraction
- TerMine: identifies key phrases and terms in texts. Offers single and batch document analysis.
- Terminology Extraction by Translated.net Labs: reports the frequency of words in a given text (the text must be pasted in; there is no upload feature). Supported languages: English, Italian and French.
- TermExtractor Beta by LCL Group: Analyses and extracts relevant terminology from a corpus of documents. Requires registration (demo restricted to 5MB).
- KeyWords Extractor v. 1 at Lextutor: determines the keywords in a corpus by comparing frequency per word to frequency in the Brown corpus.
- WebCorp Word List Generator by the RDUES: extracts relevant terms from web pages based on frequency per word.
- Filtered Word Frequencies For English Language: finds the most frequent rare English language words from a website or text.
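Several of the extractors above rank words by comparing their frequency in your text with their frequency in a reference corpus. The following is a minimal Python sketch of that idea, not the method of any listed tool: the function names, the `floor` smoothing value and the plain frequency ratio are assumptions made for illustration.

```python
import re
from collections import Counter

def relative_freqs(text):
    """Return each word's share of all word tokens in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def keywords(target, reference, top=5, floor=1e-6):
    """Rank target words by how much more frequent they are than in the reference.

    `floor` (an assumed smoothing value) stands in for words that never
    occur in the reference, so the ratio stays defined.
    """
    tgt = relative_freqs(target)
    ref = relative_freqs(reference)
    scores = {w: f / ref.get(w, floor) for w, f in tgt.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]
```

Words that are common everywhere (such as "the") get a ratio near 1 and drop down the list, while words typical of the target text rise to the top. Real tools generally use a large reference corpus and a statistical measure rather than a bare ratio.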
Concordancers
- Online concordancers at Lextutor (English and French).
- Collins Corpus Concordance Sampler: composed of 56 million words of contemporary written and spoken text.
- Web as Corpus: web concordancer in 34 different languages, with an English Web corpus (2006 and 2007) and a counter of matching web results from Yahoo! and Live Search.
- CorpusEye: multi-language concordancer (some features are not publicly available).
- WebCorp by the RDUES: (English) concordancer with advanced search options.
- WebTCE: Translation Corpus Explorer for the Web.
- BwanaNet: multilingual web concordancer for the IULA Technical Corpus (English, Spanish, Catalan, French and German).
Summarizers
- Summ-it Summarisation Applet Version 1.1, by the University of Surrey.
- Automatic Text Summarizer by LTRC (Language Technologies Research Centre).
- Summarizer by Pertinence Mining: document upload and other advanced features.
- SweSum by Martin Hassel and Hercules Dalianis (multi language, only web pages).
- Pertinence Summarizer by Pertinence Mining: multi language, advanced options (some of them premium).
Corpus analysis tools
- Uplug: collection of tools for linguistic corpus processing, word alignment and term extraction from parallel corpora.
- Glossanet: allows users to retrieve words or sequences of words from a pre-selected pool of daily newspapers (multi language).
Terminology Management
- Termbases: multilingual online terminology management service where you can create term bases, dictionaries and glossaries. You can also add pictures and synonyms or make your dictionaries public, but there is no import/export feature.
Semantic Visualization
- Structural Semantic Interconnections by LCL Group: creates semantic graphs given a set of words in context and a lexical knowledge base. Requires registration.
- Visuwords: online graphical dictionary that produces diagrams similar to a neural net.
- Visual Dictionary Online: Merriam-Webster online visual dictionary with access to more than 6,000 images.
- The Semantic Atlas, by the Institut des Sciences Cognitives and Le laboratoire L2C2: includes, among others, an English synonym dictionary and a contexonym dictionary.
- Wikimindmap: with 10 million articles there's no doubt Wikipedia is a leading reference. This great webapp will help you visualize the semantic relations between articles/entries.
Text analysers
- Topicalizer: text analysis of a document specified by a URL or a plain text, including word, sentence and paragraph count, collocations, syllable structure, lexical density, keywords, readability and a short abstract.
- Sentence and Paragraph Breaker by Scott Piao: identifies sentences and paragraphs.
- Sentence Extractor at Lextutor: replaces all terminal punctuation with a double line break.
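The sentence-extraction step described above (terminal punctuation replaced by a double line break) can be approximated in a few lines of Python. This is only a sketch of the stated behaviour; the function name and the choice of ".", "!" and "?" as terminal punctuation are assumptions, not details taken from the Lextutor tool.

```python
import re

def break_sentences(text):
    """Replace each run of . ! or ? (and any following whitespace) with a blank line."""
    return re.sub(r"[.!?]+\s*", "\n\n", text).strip()
```

A rule this simple also splits at abbreviations such as "e.g." and at decimal points, so it is best suited to quick inspection of plain running text.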
Bitext aligners
- Text Alignment Applet (System Quirk).
- Uplug Sentence Aligner: bitext alignment limited to 1,000 lines.
Phonetics
- Forvo: repository of word pronunciations recorded in their original languages.
Others
- OpenNLP: organizational centre for open source projects related to natural language processing.
- Google sets: generates lists from a small number of examples by using the web as a corpus.
- Semantic Signatures: a way to find information without knowing explicitly what you're looking for but "you know it when you see it."
- Diatopix: graphically displays the distribution of words, terms and expressions in a virtual web space delimited by a language (based on the Yahoo! search engine).
DESKTOP APPLICATIONS
Text Mining Suites
- GATE by the Natural Language Processing Research Group (University of Sheffield).
- RapidMiner (formerly YALE).
Term extraction
- ExtPhr32 by Tim Craven: "Extracts every word and every phrase up to a certain length that occurs at least a minimum number of times in a source text and that does not start or end with a stop word."
- TexNet32 by Tim Craven: "Assists in the writing of abstracts and other short summaries. Includes word and phrase extraction and various other capabilities."
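The extraction rule quoted for ExtPhr32 (every word or phrase up to a maximum length that occurs a minimum number of times and does not start or end with a stop word) can be sketched as follows. This is an illustrative Python reading of that description, not the program's actual code; the stop-word list, the tokenisation pattern and the parameter names are all assumptions.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption; a real list would be much longer).
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}

def extract_phrases(text, max_len=3, min_count=2):
    """Count words and phrases of up to max_len words, skipping any that
    start or end with a stop word, and keep those seen at least min_count times."""
    words = re.findall(r"[a-z0-9'-]+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                continue
            counts[" ".join(gram)] += 1
    return {phrase: c for phrase, c in counts.items() if c >= min_count}
```

Note that this sketch ignores sentence boundaries, so a phrase may span two sentences; stop words are still allowed in the middle of a phrase, as the quoted description implies.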
Concordancers
- ApSIC Xbench: integrated reference tool that lets you search, filter, import and export nearly any resource file format, including tabbed text files, .csv, Trados Workbench memories, Multiterm glossaries and TagEditor files, TMX, Trados Word bilingual uncleaned files, installed and exported IBM TranslationManager folders, exported IBM TranslationManager dictionaries, SDLX .itd files, Star Transit projects, Wordfast glossaries and memories, TBX files, XLIFF files and Mac OS X software files. Highly recommended.
- AntConc 3.2.1 by Laurence Anthony.
- Simple Concordance Program by Textworld: lets you create word lists and search natural language text files for words, phrases, and patterns.
- TextSTAT by Free University of Berlin: produces word frequency lists and concordances from text and HTML files.
- KWIC Concordance by Satoru Tsukamoto: corpus analytical tool for making word frequency lists, concordances and collocation tables.
- Colloqator by the University of Surrey (part of System Quirk).
- aConCorde: multi-lingual concordance tool.
- KWiCFinder: key word in context research tool for the web.
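For readers new to concordancers, the key-word-in-context (KWIC) display that these tools produce can be illustrated with a short Python sketch. It is a simplified illustration rather than the implementation of any tool listed here; the function name, the word-based context `width` and the `\w+` tokenisation are assumptions.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, match, right context) for each occurrence of keyword,
    with up to `width` words of context on either side."""
    tokens = re.findall(r"\w+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines
```

Printing the three parts in aligned columns gives the familiar concordance layout, with the search word centred on every line.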
Summarizers
- Open Text Summarizer (originally designed for Linux, but can also be used in Windows).
Corpus analysis tools
- NEPHIS32 by Tim Craven: "Incorporates the classic NEPHIS string indexing system, with some additional formatting features, including production of indexes in HTML format."
- XRefHT32 by Tim Craven: "Assists in human indexing of Web documents. Produces index displays in HTML form, including links to Web documents."
- ZOOM by Semantic Knowledge: semantic search engine (English, French, Spanish, Portuguese and Brazilian Portuguese).
- Emdros: text database engine for analyzed or annotated text.
- TATOE: semi-automated text analysis on multiple levels with a user-friendly interface.
- LEXA: set of applications for automated corpus linguistic analysis, including lexical analysis, information retrieval and database management.
- kfNgram: generates lists of n-grams in text and HTML files.
Terminology management
- System Quirk by the University of Surrey: package of integrated tools for building and managing truly multilingual term bases, technical dictionaries and translation resources.
- TheW32 by Tim Craven: "Assists in creating, modifying, and printing out a thesaurus. Allows users to define their own link types and report formats."
- AntWordProfiler 1.01 by Laurence Anthony: word profiling program.
- Dictionary Development Process: set of tools for the development of dictionaries, based on a list of semantic domains used to collect words, classify a dictionary, and facilitate semantic research.
- T-Manager: Excel spreadsheet with macros that generates automatic terminology analysis, customisation and management.
Text analysers
- AntMover 1.0 by Laurence Anthony: text structural analyser.
- PETRA tag (requires .NET Framework): performs a variety of advanced operations on the text, increasing the quality and efficiency of the translation and editing processes.
- Sentence and Paragraph Breaker by Scott Piao: Java application which identifies sentences and paragraphs.
- jTokeniser: set of tools to tokenise white spaces, regular expressions, sentences and words.
Others
- CorpusCatcher: corpus collection toolset to build language or topic specific corpora from public web resources.
- IntelliWebSearch by Michael Farrell: search within local and web resources from any Windows application.
- Google Desktop: add the power of Google to your local files (such as translations, parallel texts, glossaries, etc.).
- TBXMaker: converts glossaries stored in CSV (Comma Separated Values) to TBX (TermBase eXchange) format.
- Club Cycom: collection of translation and terminology tools and dictionaries, and educational tools and materials. Note: this is not exactly freeware, but it costs just 9 USD/8 Euro per year and includes many advanced features.