Clever Geek Handbook

Feature extraction

Feature extraction is a dimensionality reduction process in which an initial set of raw variables is reduced to more manageable groups (features) for further processing, while still describing the original data set accurately and completely [1]. Feature extraction is used in machine learning, pattern recognition, and image processing. It starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, which facilitates the subsequent learning and generalization steps and in some cases leads to better human interpretation of the data.

When the input to an algorithm is too large to process and is suspected to be redundant (for example, the same measurement recorded in both feet and meters, or the repetitiveness of images represented as pixels), it can be transformed into a reduced set of features (also called a feature vector). Determining a subset of the initial features is called feature selection [2]. The selected features are expected to contain the relevant information from the input data, so that the desired task can be performed using this reduced representation instead of the complete original data.
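The feet-and-meters case can be illustrated with a small sketch. It assumes NumPy is available and uses a correlation threshold as the redundancy test; the function name `drop_redundant_features`, the threshold value, and the synthetic data are illustrative assumptions, not part of the original article, and correlation filtering is only one of many possible feature selection strategies.

```python
import numpy as np

def drop_redundant_features(X, threshold=0.999):
    """Return indices of columns to keep (hypothetical helper).

    A column is dropped when its absolute correlation with an
    already kept column reaches the threshold.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

# Synthetic data: the same length stored in meters and in feet,
# plus an unrelated measurement.
rng = np.random.default_rng(0)
length_m = rng.uniform(1.0, 10.0, size=50)
length_ft = length_m * 3.28084
weight_kg = rng.uniform(50.0, 90.0, size=50)
X = np.column_stack([length_m, length_ft, weight_kg])

kept = drop_redundant_features(X)
X_reduced = X[:, kept]   # reduced feature vectors, one row per sample
```

Here the feet column carries no information beyond the meters column, so only the meters and weight columns remain in `X_reduced`.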

General approach

Feature extraction involves reducing the amount of resources required to describe a large data set. When analyzing complex data, one of the major problems stems from the number of variables involved. Analysis with a large number of variables generally requires a large amount of memory and computing power, and it can also cause a classification algorithm to overfit the training set, which generally leads to poor results on new samples. Feature extraction is a general term for methods of constructing combinations of variables that get around these problems while still describing the data with sufficient accuracy. Many machine learning practitioners believe that properly optimized feature extraction is the key to building an effective model [3].

Results can be improved by using a constructed set of application-dependent features, typically built by an expert. One such process is called feature engineering. Alternatively, general dimensionality reduction techniques are used, such as:

  • Independent Component Analysis
  • Latent semantic analysis
  • Principal component analysis
  • Autoencoder
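Of the techniques listed above, principal component analysis is probably the simplest to sketch. The following is a minimal illustration using the singular value decomposition in NumPy; the function name `pca` and the synthetic data are assumptions made for this example, and production code would more likely use an established library implementation.

```python
import numpy as np

def pca(X, n_components):
    """Project X (samples x variables) onto its leading principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    Z = X_centered @ components.T
    return Z, components, explained_variance[:n_components]

# Synthetic data: three variables, two of which follow one hidden factor t.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2.0 * t + 0.01 * rng.normal(size=200),
    0.01 * rng.normal(size=200),
])

Z, components, var = pca(X, n_components=1)
# Z holds one extracted feature per sample instead of three raw variables.
```

Because almost all of the variance in this data lies along one direction, a single component describes the three original variables with little loss.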

Image Processing

One very important area of application of feature extraction is image processing, in which algorithms are used to detect and isolate various desired portions or shapes (features) of a digital image or video stream. A particularly important application is optical character recognition.
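As a minimal illustration of detecting a low-level image feature, the sketch below computes an edge map with the Sobel operator, which is one common choice of edge detector and not one the article specifically names. It assumes a 2-D grayscale image stored as a NumPy array; the function name `sobel_edge_magnitude` and the test image are invented for the example.

```python
import numpy as np

def sobel_edge_magnitude(image):
    """Gradient magnitude of a 2-D grayscale image using 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)
    ky = kx.T
    # Replicate border pixels so the output has the same shape as the input.
    padded = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)

# Test image: dark left half, bright right half, i.e. one vertical edge.
image = np.zeros((8, 8))
image[:, 4:] = 1.0

edges = sobel_edge_magnitude(image)
# Only the two columns next to the intensity step (3 and 4) respond.
```

The resulting array is itself a feature map: large values mark pixels where the intensity changes sharply, and later stages can threshold it or pass it to a shape detector.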

Low level

  • Edge detection

Curvature

  • Edge direction, intensity change, autocorrelation.

Image motion

  • Area-based and differential approaches. Optical flow.

Shape-based methods

  • Keypoint detection and matching (SIFT)
  • Hough transform
    • Lines
    • Circles / ellipses
    • Arbitrary shapes (generalized Hough transform)
    • Works with any parameterizable feature (class variables, cluster detection, etc.)

Flexible methods

  • Deformable, parameterized shapes
  • Active contours (snakes)
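The Hough transform for straight lines, listed above among the shape-based methods, can be sketched as a voting procedure: every foreground pixel votes for all lines rho = x cos(theta) + y sin(theta) that pass through it, and peaks in the vote accumulator correspond to detected lines. The function name `hough_lines`, the one-degree angular resolution, and the test image are assumptions made for this illustration.

```python
import numpy as np

def hough_lines(binary_image, n_theta=180):
    """Accumulate votes in (rho, theta) space for a binary image."""
    h, w = binary_image.shape
    diag = int(np.ceil(np.hypot(h, w)))            # bound on |rho|
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rhos = np.arange(-diag, diag + 1)
    accumulator = np.zeros((len(rhos), n_theta), dtype=int)
    theta_idx = np.arange(n_theta)
    ys, xs = np.nonzero(binary_image)
    for x, y in zip(xs, ys):
        # One vote per angle: the rho of the line through (x, y) at that angle.
        rho = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rho + diag, theta_idx] += 1
    return accumulator, rhos, thetas

# Test image containing a single horizontal line at y = 10.
image = np.zeros((40, 100), dtype=bool)
image[10, :] = True

accumulator, rhos, thetas = hough_lines(image)
peak_rho_idx, peak_theta_idx = np.unravel_index(
    np.argmax(accumulator), accumulator.shape
)
best_rho = rhos[peak_rho_idx]          # distance of the line from the origin
best_theta = thetas[peak_theta_idx]    # angle of the line's normal, in radians
```

For this image the strongest peak lies at rho = 10 and theta = pi/2, which is the normal form of the horizontal line y = 10. Circles, ellipses, and other parameterizable shapes follow the same voting idea with a higher-dimensional accumulator.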

Feature extraction in software

Many statistical software packages provide feature extraction and dimensionality reduction. Common numerical computing environments such as MATLAB, Scilab, NumPy, and the R language support some of the simpler feature extraction techniques (such as principal component analysis) through built-in commands. More specialized algorithms are often available as publicly available scripts or third-party add-ons. There are also software packages aimed at specific machine learning applications that specialize in feature extraction [4].

See also

  • Cluster analysis
  • Dimensionality reduction
  • Feature Selection
  • Data mining
  • Segmentation (image processing)

Notes

  1. What is Feature Extraction? deepai.org.
  2. Alpaydin, 2010, p. 110.
  3. Reality AI Blog, "Its all about the features", September 2017, https://reality.ai/it-is-all-about-the-features/
  4. See, for example, https://reality.ai/

Literature

  • Ethem Alpaydin. Introduction to Machine Learning. London: The MIT Press, 2010. ISBN 978-0-262-01243-0.
Source - https://ru.wikipedia.org/w/index.php?title=Training_of_characters&oldid=101393927

