Lexical Computing

East Sussex

We provide large, high-quality word databases, lexical data, word lists and lexicons in many languages. Our data are generated from large collections of authentic text called text corpora. The largest corpora contain texts with a total length of 60 billion words. Data on this scale allow us to generate databases of millions or even hundreds of millions of items while preserving accuracy and reliability.

Our customers are software developers, publishers of dictionaries and language teaching materials, and anyone else who needs reliable language data. The databases we supply can be enriched with related linguistic data such as synonyms, collocations, example sentences, and morphological and statistical information. We also provide solutions for full-text search, terminology extraction, document classification and categorization, data mining and information retrieval.

Data samples
Word frequency lists: English, Spanish, French, Arabic, Russian, Portuguese, Hindi
Bigram databases: English, Spanish, German, Russian

Lexical Computing is a research company founded by Adam Kilgarriff in 2003. It works at the intersection of corpus linguistics and computational linguistics and is committed to an empiricist approach to the study of language, in which corpora play a central role: for a very wide range of linguistic questions, a suitable corpus, if one is available, will help us understand.

The flagship product of Lexical Computing is Sketch Engine, a leading corpus management and corpus query tool used by linguists, lexicographers, translators and publishers worldwide. Its unique feature, the Word Sketch, and the functions derived from it, together with its scalability, multilingual support and ability to handle the largest available corpora, make Sketch Engine stand out among corpus software.

Lexical Computing also supplies word databases, lexicons, n-gram databases and similar language data for use in other software or in lexicographic projects.
Data provided by Sketch Engine and services from Lexical Computing are based on a suite of more than 650 text corpora, which range in size up to 60 billion words and cover over 90 languages.