CuneiForm (from the English word cuneiform, the wedge-shaped script), also known as Cognitive OpenOCR, is a freely distributed open-source optical character recognition system from the Russian company Cognitive Technologies.
CuneiForm | |
---|---|
Type | optical character recognition |
Author | Cognitive Technologies |
Developers | CuneiForm-Linux team |
Written in | C, C++ |
Interface | command line |
Operating system | Linux, Mac OS X, and other UNIX-like systems |
First release | 1996 |
Latest version | 1.1.0 (April 19, 2011 [1]) |
Generated file formats | |
Status | active |
License | BSD-like license from Cognitive Technologies [2] |
Website | launchpad.net/cuneiform-... |
OCR CuneiForm was developed by Cognitive Technologies as a commercial product in 1993. The system was bundled with popular scanners, multifunction devices and software in Russia and worldwide, from vendors including Corel (Corel Draw), Hewlett-Packard, Epson, Xerox, Samsung, Brother, Mustek, OKI, Canon and Olivetti. In 2008, Cognitive Technologies opened the source code of OCR CuneiForm.
Content
- 1 Features
- 2 History
- 3 Graphical Interfaces for CuneiForm
- 4 See also
- 5 Notes
- 6 Links
Features
CuneiForm is positioned as a system for converting electronic copies of paper documents and image files into editable form, preserving the structure and typefaces of the original document, in automatic or semi-automatic mode. The system includes two programs, one for single-document processing and one for batch processing of electronic documents.
List of languages supported by the system:
- English
- Bulgarian
- Hungarian
- Dutch
- Danish
- Spanish
- Italian
- Latvian
- Lithuanian
- German
- Polish
- Portuguese
- Romanian
- Russian
- Serbian
- Slovenian
- Turkish
- Ukrainian
- French
- Croatian
- Czech
- Swedish
- Estonian
In addition, mixed Russian and English text is supported. Recognition of other language mixtures is supported only in the branch developed by Andrei Borovsky in 2009 [3]. Adding support for other languages is difficult because each language is tied to a .dat file whose structure and method of generation the developers did not disclose.
History
1993 - Cognitive Technologies signed an OEM contract with the Canadian company Corel Corporation, under which the Cognitive recognition library was integrated into the popular publishing package Corel Draw 3.0 (and subsequent versions). [4]
1994 - A contract was signed with Hewlett-Packard to bundle OCR CuneiForm with the scanners supplied to Russia. This was HP's first contract with a Russian software developer. [5] [6] [7] [8]
1995 - A contract was signed with the Japanese corporation Epson to bundle OCR CuneiForm with its scanners. [9] An OEM contract was signed with Brother Corporation, one of the world's largest manufacturers of fax machines, laser printers, scanners and other office equipment. Under the agreement, the new Brother IC-150 roller scanner would ship worldwide with Cognitive software for scanning and recognition.
1996 - An OEM contract was signed with Samsung Information Systems America, one of the world's largest manufacturers of monitors, fax machines, laser printers, multifunction devices and other office equipment. [10] Under the agreement, the new Samsung OFFICE MASTER OML-8630A multifunction device would be sold worldwide together with the Cognitive Cuneiform LE optical character recognition system.
- An OEM contract was signed with Xerox, a leading global manufacturer of office equipment, to integrate the CuneiForm recognition system with the Xerox 3006 and Pro-610 multifunction devices.
- OCR CuneiForm '96 was released, the first version in the world to use adaptive recognition algorithms.
Adaptive recognition combines two types of printed character recognition algorithms: font-dependent (multifont) and font-independent (omnifont). For each input document the system generates an internal font from well-printed characters, that is, it tunes itself dynamically (adapts) to the specific input characters. The method thus combines the versatility and practicality of the omnifont approach with the high accuracy of font-dependent recognition, which substantially improves recognition quality.
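The two-pass idea can be illustrated with a minimal Python sketch. It is not CuneiForm's implementation: the `omnifont` callable, the confidence `threshold`, the flat pixel tuples and the squared-distance template matching are all assumptions chosen for illustration.

```python
def adaptive_recognize(glyphs, omnifont, threshold=0.9):
    """Two-pass recognition sketch.

    glyphs   -- list of equal-length tuples of pixel brightness values
    omnifont -- hypothetical font-independent classifier: glyph -> (char, confidence)
    """
    # Pass 1: font-independent guess for every glyph.
    first_pass = [omnifont(g) for g in glyphs]

    # Build the document-specific "internal font" from confident glyphs only:
    # one averaged template per character.
    samples = {}
    for glyph, (char, conf) in zip(glyphs, first_pass):
        if conf >= threshold:
            samples.setdefault(char, []).append(glyph)
    font = {
        char: [sum(px) / len(group) for px in zip(*group)]
        for char, group in samples.items()
    }

    # Pass 2: re-read uncertain glyphs by matching against the internal font.
    text = []
    for glyph, (char, conf) in zip(glyphs, first_pass):
        if conf < threshold and font:
            char = min(
                font,
                key=lambda c: sum((t - p) ** 2 for t, p in zip(font[c], glyph)),
            )
        text.append(char)
    return "".join(text)
```

In this sketch a glyph that the font-independent pass reads with low confidence is reclassified against templates learned from the same document, which is the adaptation step the paragraph describes.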
1997 - The CuneiForm system uses neural-network technologies for the first time. Neural-network character recognition algorithms are constructed as follows. The character image (raster) submitted for recognition is reduced to a standard size (normalized). The brightness values at the nodes of the normalized raster serve as the inputs of the neural network. The number of network outputs equals the number of recognizable characters. The recognition result is the character corresponding to the largest value in the network's output vector.
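The pipeline just described (normalize, feed brightness values in, take the largest output) can be sketched in Python as follows. This is an illustrative single-layer network, not CuneiForm's actual architecture; the 8x8 raster size, the three-letter alphabet, the nearest-neighbour rescaling and the externally supplied `weights` and `biases` are assumptions.

```python
ALPHABET = "ABC"  # hypothetical set of recognizable characters
SIZE = 8          # hypothetical normalized raster is SIZE x SIZE


def normalize(raster, size=SIZE):
    """Rescale a 2-D list of brightness values to size x size (nearest neighbour)."""
    rows, cols = len(raster), len(raster[0])
    return [
        [raster[r * rows // size][c * cols // size] for c in range(size)]
        for r in range(size)
    ]


def classify(raster, weights, biases, alphabet=ALPHABET):
    """One output per character; the answer is the character with the largest output."""
    # Brightness values of the normalized raster are the network inputs.
    inputs = [value for row in normalize(raster) for value in row]
    # One weighted sum per recognizable character.
    outputs = [
        sum(w * x for w, x in zip(w_row, inputs)) + b
        for w_row, b in zip(weights, biases)
    ]
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return alphabet[best], outputs
```

Here `weights` holds one row of `SIZE * SIZE` values per character; in a real system those values would come from training, which the sketch leaves out.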
- A new OEM contract was signed with Canon to bundle the CuneiForm system with multifunction devices delivered to Russia; [11] [12]
- A new OEM contract was signed with OKI Europe Limited to bundle the CuneiForm system with the OKI FAX 4100 and OKI FAX 5200 multifunction devices delivered to Russia; [13]
- CuneiForm MMX Update was released, the world's first OCR system for Intel MMX processors; [14]
- NeuHaus scanners were bundled with the CuneiForm recognition system;
- CuneiForm 98 NeST was released, the first network scanning system in Russia.
1999
- A new OEM contract was signed with Olivetti to bundle the CuneiForm system with multifunction devices supplied to Russia;
- A distribution agreement was concluded with WSKA (France), a leading European software distributor, for the distribution of OCR Cuneiform Direct in Europe;
- A new version of the recognition system, CuneiForm 2000, was released. It implements the Cognitive Analysis™ method: an expert system built into the recognition engine analyzes the scores of the alternatives produced by each recognition algorithm and selects the best option;
- The "Meridian segmentation of tables" method was developed to reproduce the original layout of a table more accurately in the output document;
- A mechanism for recreating the layout of the source document, "What you scan is what you get", was developed. Until then, users of recognition systems had spent a considerable amount of time (often more than scanning and recognition themselves took) restoring the document's original appearance and making sure each fragment was in the right place. This was especially true for documents with complex layouts: multi-column texts with headings, annotations, illustrations, tables, etc.
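The source does not say how the Cognitive Analysis expert system weighs alternatives. As a rough illustration of the general idea of choosing among the scored candidates of several recognition algorithms, the Python sketch below uses a plain weighted sum; the recognizer names, the weights and the scoring rule are hypothetical.

```python
from collections import defaultdict


def select_best(alternatives, weights=None):
    """Pick one character from the scored candidates of several recognizers.

    alternatives -- {recognizer_name: [(char, score), ...]}
    weights      -- optional {recognizer_name: trust factor}, default 1.0 each
    """
    weights = weights or {}
    totals = defaultdict(float)
    for name, candidates in alternatives.items():
        for char, score in candidates:
            # Assumption: scores are comparable across recognizers and additive.
            totals[char] += weights.get(name, 1.0) * score
    return max(totals, key=totals.get)


# Two hypothetical recognizers disagree between the letter O and the digit 0.
candidates = {
    "template": [("O", 0.6), ("0", 0.5)],
    "structural": [("0", 0.7), ("O", 0.3)],
}
print(select_best(candidates))                     # equal trust in both
print(select_best(candidates, {"template": 5.0}))  # trust the template matcher more
```

A real expert system would presumably apply richer rules than a weighted sum; the sketch only shows where such a decision step sits relative to the individual recognizers.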
2001 - An OEM contract was signed to bundle Cognitive Technologies software (OCR CuneiForm) with Canon equipment (scanners, multifunction devices) in Eastern Europe.
On December 12, 2007, a freeware version of OCR CuneiForm was released and the opening of its source code was announced. [15] [16] [17]
On April 2, 2008, the OCR Cuneiform source code was published under the BSD license [18], and in the fall the source code of the system's interface was published [19].
The open-source version for Windows has not been updated since February 2009.
In 2009, graphical interfaces based on the Qt 4 library were released for the open version of Cuneiform: Cuneiform-Qt [20] and YAGF. Starting with version 0.9.0 [21], the open version for Linux can be used as a library.
Graphical Interfaces for CuneiForm
- YAGF
- OCRFeeder
- KBookOCR
- Cuneiform-Qt
- Puma.NET - .NET interface library
- Quneiform
See also
- Cognitive Forms
Notes
- Cuneiform port for Linux.
- License text. Archived on March 19, 2012.
- ~anb-symmetrica/cuneiform-linux/cuneiform-multilang: revision 400
- Made in Russia (dead link). Retrieved December 6, 2016. Archived June 27, 2013.
- "Legitimate" HP scanners will learn to read in Russian, Kommersant, 09/01/1994
- HP Scanners with the CuneiForm flag
- Hewlett-Packard presents new products in Moscow, Algonet
- HP and Cognitive deepen partnership, PCweek
- PC World. Pearls of Russian software
- Samsung chose Cognitive Technologies' OCR for multifunction devices, PCweek, 09/10/1996
- Vladimir Mitin. Canon multifunction devices gained a new quality, PCweek, 05/12/1998
- Vladimir Mitin. The Canon - Cognitive Technologies alliance develops successfully, PCweek, 02/29/2000
- Cognitive Technologies and OKI step on the gas, PCweek
- HP Informal Marketing, Computerworld
- OCR CuneiForm - the first industrial Open Source recognition project. Archived on May 25, 2011.
- Russian text recognition system CuneiForm will become open, Cybersecurity, 12/06/2007
- CuneiForm is back, Computerworld, 12/24/2007
- Cognitive Technologies has opened the Cuneiform OCR code (dead link). Retrieved April 4, 2011. Archived November 11, 2009.
- Cognitive Technologies unveiled the Cuneiform OCR code. Archived on April 19, 2011.
- Cuneiform-Qt
- Cuneiform Linux 0.9.0 is released
Links
- Cuneiform port for Linux.
- The site of the CuneiForm-based OpenOCR project.
- Konstantin Vasiliev. Russia starts and wins, Computer Press
- The best free projects in Russia have been selected, Open Systems, 12/22/2009
- Alexander Malyarevsky. On the Russian market - Russian products, PC World
- Alexey Chernobrovtsev. OCR of the next century, Computerworld (Russia)
- Andrey Ezdakov. Software will be produced in Russia too, PCweek, 09/26/1995
- Cognitive OpenOCR version 11 for Windows