
3.1 Types

C# is a strongly typed language. In a strongly typed language you must declare the type of each object you create (e.g., integers, floats, strings, windows, buttons, etc.), and the compiler will help you prevent bugs by enforcing that only data of the right type is assigned to those objects. The type of an object signals to the compiler the size of that object (e.g., int indicates an object of 4 bytes) and its capabilities (e.g., buttons can be drawn, pressed, and so forth).

Like C++ and Java, C# divides types into two sets: intrinsic (built-in) types that the language offers and user-defined types that the programmer defines.

C# also divides the set of types into two other categories: value types and reference types.[1] The principal difference between value and reference types is the manner in which their values are stored in memory. A value type holds its actual value in memory allocated on the stack (or it is allocated as part of a larger reference type object). A reference type variable holds only the address of its object: that address sits on the stack, but the object itself is stored on the heap.

[1] All the intrinsic types are value types except for Object (discussed in Chapter 5) and String (discussed in Chapter 10). All user-defined types are reference types except for structs (discussed in Chapter 7).
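The practical consequence of this distinction shows up when a variable is copied. The following is a minimal sketch, not taken from the original text: PointValue and PointReference are hypothetical types invented for illustration, one a struct (value type) and one a class (reference type).

```csharp
using System;

// Hypothetical types, for illustration only.
struct PointValue        // a struct is a value type
{
    public int X;
}

class PointReference     // a class is a reference type
{
    public int X;
}

public static class CopySemantics
{
    // Copies a struct, changes the copy, and returns the original's X.
    public static int OriginalAfterStructCopy()
    {
        PointValue a = new PointValue();
        a.X = 1;
        PointValue b = a;   // b is an independent copy of the value
        b.X = 99;
        return a.X;         // the original is unchanged
    }

    // Copies a reference, changes the object through the copy,
    // and returns the original's X.
    public static int OriginalAfterReferenceCopy()
    {
        PointReference a = new PointReference();
        a.X = 1;
        PointReference b = a;   // b refers to the same heap object as a
        b.X = 99;
        return a.X;             // the change is visible through a
    }

    public static void Main()
    {
        Console.WriteLine(OriginalAfterStructCopy());      // prints 1
        Console.WriteLine(OriginalAfterReferenceCopy());   // prints 99
    }
}
```

Assigning a value type copies the value itself, while assigning a reference type copies only the address, so both variables end up referring to one object.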

C and C++ programmers take note: In C#, there is no explicit indication that an object is a reference type (i.e., no use of the & operator). Also, pointers are not normally used (but see Chapter 22 for the exception to this rule).

In C#, the size and format of the storage for different types is platform independent and consistent across all .NET languages.

If you have a very large object, putting it on the heap has many advantages. Chapter 4 discusses the various advantages and disadvantages of working with reference types; the current chapter focuses on the intrinsic value types available in C#.

C# also supports C++ style pointer types, but these are rarely used, and only when working with unmanaged code. Unmanaged code is code created outside of the .NET platform, such as COM objects. Working with COM objects is discussed in Chapter 22.

3.1.1 Working with Built-in Types

The C# language offers the usual cornucopia of intrinsic (built-in) types one expects in a modern language, each of which maps to an underlying type in the .NET Common Type System (CTS). Mapping the C# primitive types to the underlying .NET types ensures that objects created in C# can be used interchangeably with objects created in any other language compliant with the .NET Common Language Specification (CLS), such as VB.NET.

Java programmers take note: C# has a broader range of basic types than Java. The C# decimal type is notable; it is quite useful for financial calculations.

Each type has a specific and unchanging size. Unlike in C++, a C# int is always 4 bytes because it maps to the .NET Int32 type. Table 3-1 lists the built-in value types offered by C#.

Table 3-1. C# built-in value types

Type     Size (in bytes)  .NET type  Description
byte     1                Byte       Unsigned (values 0 to 255).
char     2                Char       Unicode characters.
bool     1                Boolean    true or false.
sbyte    1                SByte      Signed (values -128 to 127).
short    2                Int16      Signed (short) (values -32,768 to 32,767).
ushort   2                UInt16     Unsigned (short) (values 0 to 65,535).
int      4                Int32      Signed integer values between -2,147,483,648 and 2,147,483,647.
uint     4                UInt32     Unsigned integer values between 0 and 4,294,967,295.
float    4                Single     Floating point number. Holds values from approximately +/-1.5 * 10^-45 to approximately +/-3.4 * 10^38 with 7 significant figures.
double   8                Double     Double-precision floating point. Holds values from approximately +/-5.0 * 10^-324 to approximately +/-1.8 * 10^308 with 15-16 significant figures.
decimal  16               Decimal    Fixed precision of up to 28 digits, plus the position of the decimal point. Typically used in financial calculations. Literals require the suffix "m" or "M".
long     8                Int64      Signed integers from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
ulong    8                UInt64     Unsigned integers from 0 to 18,446,744,073,709,551,615 (0xffffffffffffffff).
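The sizes and ranges in Table 3-1 can be checked from code. The following is a small sketch rather than anything from the original text; it assumes a C# 2.0 or later compiler, where sizeof may be applied to the built-in types outside an unsafe block. The class name BuiltInSizes is hypothetical.

```csharp
using System;

public static class BuiltInSizes
{
    public static void Main()
    {
        // For the built-in types, sizeof is a compile-time constant.
        Console.WriteLine("int:     {0} bytes", sizeof(int));      // 4
        Console.WriteLine("long:    {0} bytes", sizeof(long));     // 8
        Console.WriteLine("char:    {0} bytes", sizeof(char));     // 2
        Console.WriteLine("decimal: {0} bytes", sizeof(decimal));  // 16

        // Each C# keyword is an alias for the .NET type in the table.
        Console.WriteLine(typeof(int) == typeof(Int32));           // True

        // The range limits are available as constants on each type.
        Console.WriteLine(int.MaxValue);                           // 2147483647
        Console.WriteLine(ushort.MaxValue);                        // 65535
    }
}
```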

C and C++ programmers take note: The C# type for true/false values is bool, which maps to the .NET Boolean type. In C#, Boolean variables can only have the values true or false. Integer values do not equate to Boolean values in C# and there is no implicit conversion.
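As a brief illustration of that last point, the sketch below shows that an integer must be compared explicitly to produce a Boolean value. The class BooleanTest and the method HasItems are hypothetical names used only for this example.

```csharp
using System;

public static class BooleanTest
{
    // Hypothetical helper: turns a count into a true/false answer.
    public static bool HasItems(int count)
    {
        // if (count) { ... }   // would not compile: no implicit int-to-bool conversion
        return count != 0;      // the comparison produces a bool
    }

    public static void Main()
    {
        Console.WriteLine(HasItems(0));   // False
        Console.WriteLine(HasItems(3));   // True
    }
}
```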

In addition to these primitive types, C# has two other value types: enum (considered later in this chapter) and struct (see Chapter 7). Chapter 4 also discusses other subtleties of value types, such as forcing value types to act as reference types through a process known as boxing, and that value types do not "inherit."

The Stack and the Heap

A stack is a data structure used to store items on a last-in first-out basis (like a stack of dishes at the buffet line in a restaurant). The stack refers to an area of memory supported by the processor, on which the local variables are stored.

In C#, value types (e.g., integers) are allocated on the stack—an area of memory is set aside for their value, and this area is referred to by the name of the variable.

Reference types (e.g., objects) are allocated on the heap. When an object is allocated on the heap, its address is returned, and that address is assigned to a reference.

Objects on the stack are destroyed when the stack frame in which they are declared ends; the garbage collector is not involved. Typically a stack frame is defined by a function. Thus, if you declare a local variable of a value type within a function (as explained later in this chapter), its memory is reclaimed when the function returns.

Objects on the heap are garbage collected sometime after the final reference to them is destroyed.

C and C++ programmers take note: C# reclaims heap memory with a garbage collection system; there is no delete operator.

3.1.1.1 Choosing a built-in type

Typically you decide which size integer to use (short, int, or long) based on the magnitude of the value you want to store. For example, a ushort can only hold values from 0 through 65,535, while a uint can hold values from 0 through 4,294,967,295.

That said, memory is fairly cheap, and programmer time is increasingly expensive; most of the time you'll simply declare your variables to be of type int, unless there is a good reason to do otherwise.

Most programmers choose the signed numeric types unless they have a good reason to use an unsigned value.

Although you might be tempted to use an unsigned short to double the positive values of a signed short (moving the maximum positive value from 32,767 up to 65,535), it is easier and preferable to use a signed integer (with a maximum value of 2,147,483,647).

An unsigned variable is the better choice when the fact that the value cannot be negative is an inherent characteristic of the data. For example, if you had a variable to hold a person's age, you would use an unsigned int because an age cannot be negative.

The float, double, and decimal types offer varying degrees of size and precision. For most small fractional numbers, float is fine. Note that the compiler assumes that any number with a decimal point is a double unless you tell it otherwise. To assign a literal float, follow the number with the letter f. (Assigning literal values is discussed in detail later in this chapter.)

float someFloat = 57f;
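The difference between the three types is easiest to see with literals. The following sketch is an illustration rather than part of the original text; the class and method names are hypothetical. It shows the f and m suffixes and why decimal is favored for financial values.

```csharp
using System;

public static class LiteralSuffixes
{
    public static bool DoubleSumIsExact()
    {
        double d = 0.1 + 0.2;     // no suffix: the literals are doubles
        return d == 0.3;          // false: 0.1 has no exact binary representation
    }

    public static bool DecimalSumIsExact()
    {
        decimal m = 0.1m + 0.2m;  // m suffix: the literals are decimals
        return m == 0.3m;         // true: decimal stores base-10 digits
    }

    public static void Main()
    {
        // float bad = 57.5;      // would not compile: 57.5 is a double
        float ok = 57.5f;         // the f suffix makes the literal a float
        Console.WriteLine(ok == 57.5f);           // True

        Console.WriteLine(DoubleSumIsExact());    // False
        Console.WriteLine(DecimalSumIsExact());   // True
    }
}
```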

The char type represents a Unicode character. char literals can be simple, Unicode, or escape characters enclosed by single quote marks. For example, 'A' is a simple character while '\u0041' is a Unicode character. Escape characters are special two-character tokens in which the first character is a backslash. For example, '\t' is a horizontal tab. The common escape characters are shown in Table 3-2.

Table 3-2. Common escape characters

Char  Meaning
\'    Single quote
\"    Double quote
\\    Backslash
\0    Null
\a    Alert
\b    Backspace
\f    Form feed
\n    Newline
\r    Carriage return
\t    Horizontal tab
\v    Vertical tab
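The three kinds of char literal can be sketched as follows. This example is an illustration only; the class name CharLiterals and the helper SameCharacter are hypothetical.

```csharp
using System;

public static class CharLiterals
{
    // Hypothetical helper: compares a simple literal with its Unicode form.
    public static bool SameCharacter()
    {
        char simple = 'A';         // simple character
        char unicode = '\u0041';   // Unicode form of the same character
        return simple == unicode;
    }

    public static void Main()
    {
        char tab = '\t';           // escape character: horizontal tab
        char quote = '\'';         // a single quote must itself be escaped
        char backslash = '\\';     // so must a backslash

        Console.WriteLine(SameCharacter());          // True
        Console.WriteLine("Name{0}Value", tab);      // the two words separated by a tab
        Console.WriteLine("{0} {1}", quote, backslash);
    }
}
```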

3.1.1.2 Converting built-in types

Objects of one type can be converted into objects of another type either implicitly or explicitly. Implicit conversions happen automatically; the compiler takes care of it for you. Explicit conversions happen when you "cast" a value to a different type. The semantics of an explicit conversion are "Hey! Compiler! I know what I'm doing." This is sometimes called "hitting it with the big hammer" and can be very useful or very painful, depending on whether your thumb is in the way of the nail.

VB6 programmers take note: In VB6 you can easily mix strings and the character data type; a character is treated as a string with a length of one. But C# is type safe. In order to assign a literal character to a char variable, you must surround it with single quotes.

Note also that the VB6 functions that convert between a character and its ASCII equivalent (Chr( ) and Asc( )) don't exist in C#. Instead, use a cast. In C#, to convert a char to its ASCII equivalent, cast it to an int (integer):

(int)'A'

To convert a number to a char, cast the number as a char:

(char)65

Implicit conversions happen automatically and are guaranteed not to lose information. For example, you can implicitly convert from a short (2 bytes) to an int (4 bytes). No matter what value is in the short, it is not lost when converting to an int:

short x = 5;
int y = x; // implicit conversion

If you convert the other way, however, you certainly can lose information. If the value in the int is greater than 32,767 (or less than -32,768), it will be truncated in the conversion. For this reason, the compiler will not perform an implicit conversion from int to short:

short x;
int y = 500;
x = y;  // won't compile

You must explicitly convert using the cast operator:

short x;
int y = 500;
x = (short) y;  // OK
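To see what "losing information" means in practice, consider the sketch below. It is an illustration, not part of the original text: the class NarrowingCast and the method Narrow are hypothetical, and the unchecked keyword is used as an assumption to make the truncating behavior explicit whatever overflow-checking option the compiler is run with.

```csharp
using System;

public static class NarrowingCast
{
    // Hypothetical helper: explicitly converts an int to a short.
    public static short Narrow(int value)
    {
        // unchecked keeps only the low-order 16 bits instead of
        // raising an overflow error when the value does not fit.
        return unchecked((short)value);
    }

    public static void Main()
    {
        Console.WriteLine(Narrow(500));     // 500: fits in a short
        Console.WriteLine(Narrow(40000));   // -25536: the high-order bits are discarded
    }
}
```

The cast compiles in both cases; it is up to the programmer to know that the value fits in the smaller type.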

All of the intrinsic types define their own conversion rules. At times it is convenient to define conversion rules for your user-defined types, as discussed in Chapter 5.
