Feature extraction

Feature extraction is a dimensionality reduction process in which an initial set of raw variables is reduced to more manageable groups (features) for further processing, while still describing the original data set accurately and completely ^[1]. Feature extraction is used in machine learning, pattern recognition, and image processing. It starts from the initial data set and builds derived values (features) that are intended to be informative and non-redundant, which facilitates the subsequent learning and generalization steps and in some cases leads to better human interpretation of the data.

When the input to an algorithm is too large to process and is suspected to be redundant (for example, the same measurement recorded in both feet and meters, or the repetitiveness of images represented as pixels), it can be transformed into a reduced set of features (called a feature vector). Determining a subset of the initial features is called feature selection ^[2]. The selected features are expected to contain the relevant information from the input data, so that the desired task can be performed using this reduced representation instead of the complete original data.
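The feet-and-meters case can be illustrated with a minimal sketch of feature selection. It assumes that near-perfect Pearson correlation is an acceptable test for redundancy; the column names, the sample values, the threshold, and the helper `drop_redundant` are hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.

    Assumes neither sequence is constant (a zero variance would divide by zero).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

def drop_redundant(columns, threshold=0.999):
    """Keep a column only if it is not almost perfectly correlated
    with a column that has already been kept."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept

heights_m = [1.52, 1.60, 1.75, 1.81, 1.68]
data = {
    "height_m": heights_m,
    "height_ft": [h / 0.3048 for h in heights_m],  # same measurement, other unit
    "weight_kg": [50.0, 72.0, 68.0, 77.0, 80.0],
}
selected = drop_redundant(data)
print(list(selected))
```

The height in feet is a rescaled copy of the height in meters, so it carries no additional information and is dropped, while the weight column is kept.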


General approach

Feature extraction involves reducing the number of resources required to describe a large data set. When analyzing complex data, one of the main problems stems from the number of variables involved. Analysis with a large number of variables generally requires a large amount of memory and computational power, and it can also cause a classification algorithm to overfit the training set, which generally leads to poor results on new samples. Feature extraction is a general term for methods that construct combinations of the variables to get around these problems while still describing the data with sufficient accuracy. Many machine learning practitioners believe that properly optimized feature extraction is the key to building an effective model ^[3].

Results can be improved using constructed sets of application-dependent features, typically built by experts. One such process is called feature engineering. Alternatively, general dimensionality reduction techniques are used, such as:

Independent component analysis
Latent semantic analysis
Principal component analysis
Autoencoder
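Of the techniques listed, principal component analysis is the simplest to sketch. The example below is one possible formulation, assuming NumPy is available and computing the components from the singular value decomposition of the centered data; the function name `pca` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def pca(data, n_components):
    """Project data (rows are samples) onto its leading principal components."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    variances = singular_values ** 2
    explained_ratio = variances[:n_components].sum() / variances.sum()
    return centered @ components.T, components, explained_ratio

# Synthetic data: the third variable is almost the sum of the first two,
# so two extracted features should describe nearly all of the variance.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + b + 0.01 * rng.normal(size=200)
data = np.column_stack([a, b, c])

reduced, components, explained_ratio = pca(data, n_components=2)
print(reduced.shape, float(explained_ratio))
```

Unlike feature selection, each extracted feature here is a linear combination of all the original variables rather than a subset of them.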

Image processing

One very important application area of feature extraction is image processing, in which algorithms are used to detect and isolate various desired portions or shapes (features) of a digitized image or video stream. It is particularly important in optical character recognition.

Low level

Edge detection

Curvature

Edge direction, change in intensity, autocorrelation.
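Edge detection and edge direction can be sketched with the Sobel operator, one common gradient-based choice among many. The code assumes NumPy and a grayscale image stored as a 2-D float array; the helper names and the tiny test image are illustrative.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(image, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel (valid region only)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * image[dy:dy + h - 2, dx:dx + w - 2]
    return out

def sobel_edges(image):
    """Return the gradient magnitude and gradient direction (radians)."""
    gx = filter3x3(image, SOBEL_X)  # horizontal intensity change
    gy = filter3x3(image, SOBEL_Y)  # vertical intensity change
    return np.hypot(gx, gy), np.arctan2(gy, gx)

# Test image: dark left half, bright right half, so one vertical edge.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
magnitude, direction = sobel_edges(image)
print(magnitude[0])
```

The magnitude is non-zero only in the columns next to the step in intensity, and the gradient there points along the x axis, that is, perpendicular to the edge.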

Image motion

Motion detection: area-based and differential approaches. Optical flow.
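The differential approach can be sketched as a least-squares solution of the brightness constancy equation over one image patch, in the spirit of the Lucas-Kanade method. This is a simplified illustration, assuming NumPy, a small sub-pixel displacement, and a single motion vector for the whole patch; the synthetic `pattern` function and the displacement values are made up for the example.

```python
import numpy as np

def estimate_flow(frame1, frame2):
    """Estimate one displacement (u, v) for the whole patch by least squares."""
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    iy, ix = np.gradient(frame1)
    it = frame2 - frame1  # temporal derivative between the two frames
    # Brightness constancy at every pixel: ix * u + iy * v + it = 0.
    a = np.column_stack([ix.ravel(), iy.ravel()])
    b = -it.ravel()
    (u, v), *_ = np.linalg.lstsq(a, b, rcond=None)
    return u, v

def pattern(x, y):
    """Smooth synthetic texture, chosen only for illustration."""
    return np.sin(0.3 * x) + np.cos(0.2 * y)

ys, xs = np.mgrid[0:40, 0:40].astype(float)
true_u, true_v = 0.2, -0.1  # assumed shift in pixels
frame1 = pattern(xs, ys)
frame2 = pattern(xs - true_u, ys - true_v)

u, v = estimate_flow(frame1, frame2)
print(float(u), float(v))
```

The estimate is only approximate, because the method linearizes the image around each pixel; practical implementations typically work on local windows and image pyramids to handle larger motions.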

Shape-based methods

SIFT, an algorithm for detecting and matching keypoints
Hough transform
- Lines
- Circles / ellipses
- Arbitrary shapes (generalized Hough transform)
- Works with any parameterizable feature (class variables, cluster detection, etc.)
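The line-detecting variant of the Hough transform can be sketched as a voting procedure in (rho, theta) space, using the normal form x cos(theta) + y sin(theta) = rho. The code assumes NumPy and a binary edge mask as input; the one-degree angular resolution and the test image are arbitrary choices for illustration.

```python
import numpy as np

def hough_lines(edge_mask, n_theta=180):
    """Accumulate votes in (rho, theta) space for every edge pixel."""
    h, w = edge_mask.shape
    diag = int(np.ceil(np.hypot(h, w)))  # largest possible |rho|
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)
    accumulator = np.zeros((2 * diag + 1, n_theta), dtype=int)
    ys, xs = np.nonzero(edge_mask)
    for x, y in zip(xs, ys):
        # Each pixel votes once per angle, for the line through it at that angle.
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rhos + diag, np.arange(n_theta)] += 1
    return accumulator, thetas, diag

# Test image containing a single horizontal line at y = 7.
mask = np.zeros((40, 100), dtype=bool)
mask[7, :] = True

accumulator, thetas, diag = hough_lines(mask)
rho_index, theta_index = np.unravel_index(accumulator.argmax(), accumulator.shape)
rho = rho_index - diag
theta_deg = np.rad2deg(thetas[theta_index])
print(rho, theta_deg)
```

The strongest accumulator cell corresponds to the line whose normal is vertical and whose distance from the origin is 7 pixels. Circles, ellipses, and other parameterizable shapes follow the same voting idea with a higher-dimensional parameter space.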

Flexible methods

Deformable, parameterized shapes
Active contours (snakes)

Feature extraction in software

Many data analysis software packages provide feature extraction and dimensionality reduction. Common numerical programming environments such as MATLAB, Scilab, NumPy, and the R language provide some of the simpler feature extraction techniques (such as principal component analysis) via built-in commands. More specialized algorithms are often available as public scripts or third-party add-ons. There are also software packages that target specific machine learning applications and specialize in feature extraction ^[4].

Notes

[1] What is Feature Extraction? deepai.org.
[2] Alpaydin, 2010, p. 110.
[3] Reality AI Blog, "Its all about the features", September 2017, https://reality.ai/it-is-all-about-the-features/
[4] See, for example, https://reality.ai/

Literature

Ethem Alpaydin. Introduction to Machine Learning. London: The MIT Press, 2010. ISBN 978-0-262-01243-0.
