Status: production + further development
OS: Linux, OS X, Windows

Korektor

1. Introduction

Korektor is a statistical spellchecker and (occasional) grammar checker released under the 2-Clause BSD license and versioned using Semantic Versioning.

Korektor started as Michal Richter's diploma thesis, Advanced Czech Spellchecker, and has been developed further since. It comes in two versions: a command-line utility (tested on Linux, Windows and OS X) and a REST service with a publicly available API and an HTML front end.

The original OS X SpellServer, which provided a System Service integrating Korektor with native OS X GUI applications, is no longer developed, but do not hesitate to contact us if you are interested in it.

Copyright 2015 by Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic.

2. Online

2.1. Korektor Spellchecker Browser Plugin

Korektor Spellchecker is a browser plugin that makes the Korektor spellchecker available in most editable input fields. The plugin can either correct the content directly or show a dialog with suggested corrections.

Note that the dialog with suggestions is injected directly into the original page, so various problems can occur on untested sites. However, the plugin seems to work fine on many sites.

Although the sources of the Korektor service and this plugin are available under the BSD-3-Clause license, please respect the CC BY-NC-SA license of the spellchecking models.

The plugin is available for the following browsers:

2.2. Korektor Online Demo

LINDAT/CLARIN hosts the Korektor Online Demo.

2.3. Korektor Web Service

LINDAT/CLARIN also hosts the Korektor Web Service.

3. Release

3.1. Download

Korektor releases are available on GitHub, either as pre-compiled binary packages or as source-code-only packages.

3.1.1. Spellchecker Models

Korektor needs a spellchecker model to run. The language models are available from the LINDAT/CLARIN infrastructure and are described further in the Korektor User's Manual. The following language models are currently available:

3.1.2. Michal Richter's Original Version

Michal Richter's original version can be downloaded here.

3.2. License

Korektor is an open-source project and is freely available for non-commercial purposes. The library is distributed under the 2-Clause BSD license, and the associated models and data are distributed under CC BY-NC-SA, although for some models the original data used to create the model may impose additional licensing conditions.

If you use this tool for scientific work, please give us credit by referencing the Korektor website and Richter et al. 2012.

4. Installation

Korektor Installation is described on a separate page.

5. User's Manual

The Korektor User's Manual is on a separate page.

6. Model Creation

Korektor Model Creation is described on a separate page.

7. Contact

Current Authors:

Original Author:

Korektor website.

Korektor LINDAT/CLARIN entry.

8. Acknowledgements

This work has been using language resources developed and/or stored and/or distributed by the LINDAT/CLARIN project of the Ministry of Education of the Czech Republic (project LM2010013).

Acknowledgements for individual language models are listed on the Korektor User's Manual page.

8.1. Publications

  • (Richter et al. 2012) Richter Michal, Straňák Pavel and Rosen Alexandr. Korektor – A System for Contextual Spell-checking and Diacritics Completion. In Proceedings of the 24th International Conference on Computational Linguistics (Coling 2012), pages 1-12, Mumbai, India, 2012.

8.2. Bibtex for Referencing

@InProceedings{richter12,
  booktitle    = {Proceedings of the 24th International Conference on Computational Linguistics (Coling 2012)},
  title        = {Korektor--A System for Contextual Spell-checking and Diacritics Completion},
  editor       = {Martin Kay and Christian Boitet},
  author       = {Michal Richter and Pavel Stra{\v{n}}{\'{a}}k and Alexandr Rosen},
  year         = {2012},
  publisher    = {Coling 2012 Organizing Committee},
  organization = {{IIT} Bombay},
  address      = {Mumbai, India},
  venue        = {{IIT} Bombay, {VMCC}},
  pages        = {1--12}
}
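
The entry above can be cited from a LaTeX document using its key, richter12. The following is only a minimal sketch: the file name references.bib and the plain bibliography style are assumptions chosen for illustration, so substitute whatever your document already uses.

```latex
% Minimal sketch. Assumes the BibTeX entry above has been saved
% in a file named references.bib (hypothetical name) next to this document.
\documentclass{article}
\begin{document}

% richter12 is the citation key defined in the BibTeX entry.
Spelling errors were corrected with Korektor~\cite{richter12}.

% The plain style is an arbitrary choice; any BibTeX style works.
\bibliographystyle{plain}
\bibliography{references}

\end{document}
```

Standard BibTeX styles silently ignore fields they do not recognize, such as venue, so the entry can be used unchanged.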

8.3. Persistent Identifier

If you prefer to reference Korektor by a persistent identifier (PID), you can use http://hdl.handle.net/11234/1-1469.
