Integer , an integer data type ( Eng. Integer ), in computer science - one of the simplest and most common data types in programming languages . Serves for representing integers .
The set of numbers of this type is a finite subset of an infinite set of integers, limited by the maximum and minimum values.
In programming, there are unsigned integers and signed integers. The sign of the number is usually encoded by the most significant bit of the machine word. Traditionally, if the most significant bit is 1, then the number is considered negative only if it is not defined as unsigned.
The number of numbers in a machine image of a set of integers depends on the length of the machine word, usually expressed in bits . For example, with a machine word length of 1 byte (8 bits), the range of representable integers with a sign from −128 to 127. In an unsigned format, the byte representation of the number will be from 0 to 255 (2 8 - 1). If a 32-bit machine word is used , a signed integer will represent values from −2 147 483 648 (-2 31 ) to 2 147 483 647 (2 31 −1); a total of 1 0000 0000 16 (4 294 967 296 10 ) possible values.
The limitation of the length of a machine word, due to the specific hardware implementation of a particular computer, is not an obstacle for their processing of representations of integers that are very long in bits, achieved by complicating software algorithms. A natural limitation is the finiteness of the memory capacity and a reasonable execution time.
Integers and calculations with integers in modern computers are very important (in the vast majority of applications they take up less processor resources than floating point arithmetic). All address arithmetic and array index operations are based on integer operations.
View
In the memory of a typical computer system, an integer is represented as a chain of bits of a fixed (multiple of 8) size. This sequence of zeros and ones is nothing more than a binary notation of a number , since positional binary code is usually used to represent numbers in modern computer technology. The range of integers, as a rule, is determined by the number of bytes in the computer's memory allocated to one variable.
Many programming languages offer a choice between short ( English short ), long ( English long ) and integers of standard length. The length of the standard integer type , as a rule, coincides with the size of the machine word on the target platform . For 16-bit operating systems - this type (int) is 2 bytes and matches the type short int (can be used as short, omitting the word int), for 32-bit operating systems it will be 4 bytes and coincides with a long integer long int (can be used as long, omitting the word int), and in this case it will be 4 bytes. The short integer short int, for 16-bit operating systems, 32-bit operating systems, and for most 64-bit operating systems is 2 bytes. Also, in some languages, the data type is double long, long long, which is 8 bytes.
For 64-bit operating systems, given the difference in data models (LP64, LLP64, ILP64), an integer representation on different data models may differ. The int and long types can be either 4 or 8 bytes.
It should be noted that each programming language implements its signature of the representation of integers, which may differ from international standards, but must support it. For example, the Qt cross-platform library can be attributed, where the whole is represented by qintX and quintX, where X-8,16,32,64.
Integer types are subdivided into unsigned ( English unsigned ) and signed ( English signed ).
Unsigned integers
Unsigned integers represent only non-negative numbers, while all digits of the code are used to represent the value of the number and the maximum number corresponds to the unit values of the code in all digits: 111 ... 111. An unsigned m-byte variable of integer type obviously takes values from 0 to +2 8m −1.
In C and C ++ , unsigned is used to indicate unsigned types. In C # , the prefix u ( unsigned ) is used as an indicator of unsignedness. For example, to declare an unsigned integer equal in size to one machine word , the uint type is used.
Unsigned integers, in particular, are used for addressing memory , representing characters .
Sometimes in the literature [1] there are recommendations not to use the unsigned integer type, since it may not be implemented by a computer processor , but this advice is unlikely to be relevant - most modern processors (including x86- compatible [2] ) work equally well as with signed and unsigned integers.
In some languages, such as Java , unsigned integer types (except for the character) are absent [3] .
Misuse of unsigned integers can lead to non-obvious errors due to overflow [4] . In the example below, using unsigned integers in a loop in C and C ++ turns this loop into an infinite:
char ar [ N ];
for ( unsigned int i = N - 1 ; i > = 0 ; - i ) {
ar [ i ] = i ;
}
Signed integers
There are several different ways to represent integer values in binary code as a signed quantity in . In particular, direct and reverse codes can be called. The sign is encoded in the high order of the number: 0 corresponds to positive, and 1 corresponds to negative numbers.
More exotic representations of negative numbers can also be used, such as, for example, the base number system -2. [five]
However, for most modern processors, the usual representation of signed numbers is additional code . The maximum positive number is represented by the binary code 0111 ... 111, the maximum modulus is negative with the code 1000 ... 000, and the code 111 ... 111 corresponds to −1. Such a representation of numbers corresponds to the simplest implementation of arithmetic logic devices of the processor on logical gates and allows using the same addition and subtraction algorithm for unsigned numbers and for signed numbers (the difference is only in conditions under which it is believed that arithmetic overflow ).
The signed m-byte variable of integer type takes values from −2 8m-1 to +2 8m-1 −1.
Limit values for different bit
Below is a table of limit values for decimal numbers for different digits when encoding with additional code . In the column "Maximum Decimal", first goes the maximum value of the unsigned integer, and below it the minimum and maximum integer with a sign.
| Bits | Bytes | Dv. words | Maximum decimal | Des numbers |
|---|---|---|---|---|
| four | ½ | ⅛ | 15 −8 +7 | 2 one one |
| eight | one | ¼ | 255 −128 +127 | 3 3 3 |
| sixteen | 2 | ½ | 65535 −32768 +32767 | five five five |
| 24 | 3 | ¾ | 16777215 −8388608 +8388607 | eight 7 7 |
| 32 | four | one | 4294967295 −2147483648 +2147483647 | ten ten ten |
| 48 | 6 | 1½ | 281474976710655 −140737488355328 +140737488355327 | 15 15 15 |
| 64 | eight | 2 | 18446744073709551615 −9223372036854775808 +9223372036854775807 | 20 nineteen nineteen |
| 96 | 12 | 3 | 79228162514264337593543950335 −39614081257132168796771975168 +39614081257132168796771975167 | 29th 29th 29th |
| 128 | sixteen | four | 340282366920938463463374607431768211455 −170141183460469231731687303715884105728 +170141183460469231731687303715884105727 | 39 39 39 |
| 256 | 32 | eight | 115792089237316195 (...) 584007913129639935 −57896044618658097 (...) 792003956564819968 +57896044618658097 (...) 792003956564819967 | 78 77 77 |
| 512 | 64 | sixteen | 13407807929942597099 (...) 946433649006084095 −6703903964971298549 (...) 973216824503042048 +6703903964971298549 (...) 973216824503042047 | 155 154 154 |
| 1024 | 128 | 32 | 179769313486231590 (...) 356329624224137215 −89884656743115795 (...) 678164812112068608 +89884656743115795 (...) 678164812112068607 | 309 308 308 |
| 2048 | 256 | 64 | 32317006071311007 (...) 853611059596230655 −16158503035655503 (...) 926805529798115328 +16158503035655503 (...) 926805529798115327 | 617 617 617 |
| 4096 | 512 | 128 | 1044388881413152506 (...) 708340403154190335 −522194440706576253 (...) 354170201577095168 +522194440706576253 (...) 354170201577095167 | 1234 1233 1233 |
| 8192 | 1024 | 256 | 1090748135619415929 (...) 505665475715792895 −545374067809707964 (...) 252832737857896448 +545374067809707964 (...) 252832737857896447 | 2467 2466 2466 |
| 16384 | 2048 | 512 | 1189731495357231765 (...) 027290669964066815 −594865747678615882 (...) 513645334982033408 +594865747678615882 (...) 513645334982033407 | 4933 4932 4932 |
| 32768 | 4096 | 1024 | 1415461031044954789 (...) 668104633712377855 −707730515522477394 (...) 334052316856188928 +707730515522477394 (...) 334052316856188927 | 9865 9864 9864 |
| 65536 | 8192 | 2048 | 2003529930406846464 (...) 587895905719156735 −1001764965203423232 (...) 793947952859578368 +1001764965203423232 (...) 793947952859578367 | 19729 19729 19729 |
| 131072 | 16384 | 4096 | 4014132182036063039 (...) 812318570934173695 −2007066091018031519 (...) 906159285467086848 +2007066091018031519 (...) 906159285467086847 | 39457 39457 39457 |
| 262144 | 32768 | 8192 | 16113257174857604736 (...) 605349934298300415 −8056628587428802368 (...) 302674967149150208 +8056628587428802368 (...) 302674967149150207 | 78914 78913 78913 |
| 524288 | 65536 | 16384 | 259637056783100077 (...) 364528226185773055 −129818528391550038 (...) 182264113092886528 +129818528391550038 (...) 182264113092886527 | 157827 157827 157827 |
| 1048576 | 131071 | 32767 | 67411401254990734 (...) 119068940335579135 −33705700627495367 (...) 559534470167789568 +33705700627495367 (...) 559534470167789567 | 315653 315653 315653 |
Operations on Integers
Arithmetic operations
Arithmetic operations are primarily applicable to integer values. Below are the most frequently used (in parentheses are their designations in various programming languages and similar tools).
- Comparison Here the ratios “equal” (“
=”; “==”; “eq”), “not equal” (“!=”; “<>”; “ne”), “more” (“>”; “gt")," greater than or equal to "(">=";"ge")," less than "("<";"lt") and" less than or equal to "("<=";"le"). - Increment ( eng. Inc rement ; “
++”) and decrement ( eng. Dec rement ; “--”) - an arithmetic increase or decrease in the number by one. Separated into separate operations due to frequent use with counter variables in programming. - Addition ( Eng. Add ition ; "
+") and Subtraction ( Eng. Sub traction ; "-"). - Multiplication ( English mul tiplication ; "
*"). - Division ( eng. Div ision ; “
/”; “\”) and the remainder of the division ( eng. Mod ulo ; “%”). Some processors (for example, x86 architectures) allow you to perform both of these operations in one instruction. - Sign inversion ( English neg ation ) and obtaining the absolute value ( English abs olute ).
- Getting a mark . The result of such an operation is usually 1 for positive values, −1 for negative and 0 for zero.
- Extinction (
«^»).
In some programming languages, for brevity, there are operators that allow you to perform an arithmetic operation with an assignment. For example, “ += ” adds the current value of the variable on the left with the expression on the right and puts the result in the original variable. Also, in some languages and environments, the combined operation MulDiv is available , which multiplies by one number, and then divides the result into a second.
Usually the most expensive operations in terms of speed are multiplication and division (getting the remainder of division).
In the computer's memory, cells of a fixed volume are usually allocated for storing integers. Because of this, increasing and decreasing operations can lead to overflow, which results in a distortion of the result. Some programming languages allow throwing exceptions in such cases. In addition, you can define overflow behavior:
- Loop operation (usually occurs by default). For example, if you increment an 8-bit unsigned value of 255, you get 0.
- Saturation operation. If the limit is reached, then the final value will be the limit. For example, if you add 10 to an 8-bit unsigned number 250, you get 255. Addition, subtraction, and saturation multiplication are usually used when working with color.
Bitwise operations
In addition to mathematical ones, bit operations that are based on the peculiarities of positional binary coding are applicable to integers. Usually they are performed much faster than arithmetic operations and therefore they are used as more optimal analogues.
- A bit shift to the left with zero padding is similar to multiplying a number by a power of two (the number of shift bits corresponds to a power of two).
- A right shift bit is similar to dividing by a power of two (the number of shift bits corresponds to a power of two). Some programming languages and processors support arithmetic shift, which allows you to save the sign in integers with a sign (the value of the high bit is stored).
- In integers with a sign, the sign can be recognized by the most significant bit (in the negative, it is set).
- Reading and setting the low-order bit allows you to control the parity (it is set for odd numbers).
- The bitwise “AND” over a certain number of least significant bits allows you to find out the remainder of the division by the power of two (the degree corresponds to the number of bits).
- The bitwise "OR" over a certain number of least significant bits and the subsequent increment rounds the number by a value equal to the power of two (the degree corresponds to the number of bits) - is used to align addresses and sizes by a certain value.
Work with strings
Quite frequent operations are to get a string from a numeric value in the internal representation and vice versa - a number from a string. When converting to a string, formatting tools are usually available depending on the user's language.
Listed below are some of the number representations in a row.
- Decimal number When you receive a string, you can usually set the separators of the digits, the number of characters (leading zeros are added, if there are fewer ones) and the obligatory indication of the sign of the number.
- A number in the number system, which is a power of two. The most common: binary (binary English binary ), octal ( English octal ) and hexadecimal ( English hexadecimal ). When receiving a string, you can usually specify the separators of groups of numbers and the minimum number of digits (zeros are added if there are fewer ones). Since these representations are most often used in programming, the corresponding options are usually available here. For example, specifying a prefix and postfix to get a value in accordance with the language syntax. For hexadecimal it is important to indicate the case of characters, as well as the mandatory addition of zero if the first digit is represented by a letter (so that the number is not defined as a string identifier).
- Roman number ( roman ).
- Verbal representation (including the amount in words ) - the number is represented by words in the indicated natural language.
Enumerated type
An integer is also an integer. Enumerated type variables accept a finite predefined set of values. The size of the set is not determined by the number of bytes used to represent the integer values of variables of this type.
For example, in Python, the boolean type is a subtype of the integer and uses the names False and True, which, when cast to the integer, get the values 0 and 1, respectively [6] .
See also
- Direct code
- Reverse code
- Additional code
- Arithmetic of arbitrary precision
Notes
- ↑ Ben-Ari, 2000 , p. 54.
- ↑ Lesson 7. Advanced arithmetic operations with integers , Low-level programming, SSTC NSU
- ↑ Types, Values and Variables , Java Languaege Specification, 2-nd ed.
- ↑ “Are Unsigned Integers Necessary?” (January 22, 2013)
- ↑ Hacker's Delight, 2004 , p. 215-221.
- ↑ Beazley, 2009 , pp. 38.
Literature
- Basic definitions of digital and microprocessor technology , Distance Learning System, St. Petersburg State University ITMO , Software for measuring systems based on universal computers
- M. Ben-Ari. Chapter 4. Elementary data types // Programming languages. Practical Benchmarking = Understanding Programming Language. - Moscow: Mir, 2000 .-- S. 53-74. - 366 p. - ISBN 5-03-003314-9 .
- Terence Pratt, Marvin Zelkowitz. 5.2. Scalar data types // Programming languages. Design and implementation = Programming Language. Design and Implementation. - 4-th edition. - Peter, 2002 .-- S. 205-216. - 688 p. - ISBN 5-03-003314-9 .
- Henry Warren Jr. Algorithmic tricks for programmers = Hacker's Delight. - Williams , 2004 .-- 288 p. - ISBN 5-8459-0572-9 .
- Behrooz Parhami. Computer Arithmetic: Algorithms and Hardware Designs . - New York: Oxford University Press, 2000 .-- 510 p. - ISBN 0-19-512583-5 .
- David M. Beazley. Python Essential Reference. - 4th Edition. - Addison-Wesley Professional, 2009 .-- 717 p. - ISBN 978-0672329784 .