# Data types

A data type is a collection of related values. These collections need not be disjoint, and they are often hierarchical. Scheme has a rich set of data types: some are simple (indivisible) data types and others are compound data types made by combining other data types.

## 2.1 Simple data types

The simple data types of Scheme include booleans, numbers, characters, and symbols.

### 2.1.1 Booleans

Scheme's booleans are `#t` for true and `#f` for false. Scheme has a predicate procedure called `boolean?` that checks if its argument is boolean.

```
(boolean? #t)              =>  #t
(boolean? "Hello, World!") =>  #f
```

The procedure `not` negates its argument, considered as a boolean.

```
(not #f)              =>  #t
(not #t)              =>  #f
(not "Hello, World!") =>  #f
```

The last expression illustrates a Scheme convenience: In a context that requires a boolean, Scheme will treat any value that is not `#f` as a true value.
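A short sketch of this convention, using the conditional form `if` (which this section has not introduced; it is assumed here only for illustration). Each test value below is something other than `#f`, so each is treated as true:

```scheme
;; Only #f selects the "else" branch of a conditional.
(if #f 'yes 'no)               ; => no
(if "Hello, World!" 'yes 'no)  ; => yes
(if 0 'yes 'no)                ; => yes  (zero is not #f)
(if '() 'yes 'no)              ; => yes  (the empty list is not #f in standard Scheme)
```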

### 2.1.2 Numbers

Scheme numbers can be integers (eg, `42`), rationals (`22/7`), reals (`3.1416`), or complex (`2+3i`). An integer is a rational is a real is a complex number is a number. Predicates exist for testing the various kinds of numberness:

```
(number? 42)       =>  #t
(number? #t)       =>  #f
(complex? 2+3i)    =>  #t
(real? 2+3i)       =>  #f
(real? 3.1416)     =>  #t
(real? 22/7)       =>  #t
(real? 42)         =>  #t
(rational? 2+3i)   =>  #f
(rational? 3.1416) =>  #t
(rational? 22/7)   =>  #t
(integer? 22/7)    =>  #f
(integer? 42)      =>  #t
```

Scheme integers need not be specified in decimal (base 10) format. They can be specified in binary by prefixing the numeral with `#b`. Thus `#b1100` is the number twelve. The octal prefix is `#o` and the hex prefix is `#x`. (The optional decimal prefix is `#d`.)
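As a small illustration, here is the number twelve written with each of the four prefixes. The results are shown as comments so that the block can be fed to the listener as is:

```scheme
;; The same integer, twelve, in four radixes.
#b1100  ; => 12  (binary)
#o14    ; => 12  (octal)
#xc     ; => 12  (hexadecimal)
#d12    ; => 12  (decimal; the prefix is optional)
```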

Numbers can be tested for equality using the general-purpose equality predicate `eqv?`.

```
(eqv? 42 42)   =>  #t
(eqv? 42 #f)   =>  #f
(eqv? 42 42.0) =>  #f
```

However, if you know that the arguments to be compared are numbers, the special number-equality predicate `=` is more apt.

```
(= 42 42)   =>  #t
(= 42 #f)   -->ERROR!!!
(= 42 42.0) =>  #t
```

Other number comparisons allowed are `<`, `<=`, `>`, `>=`.

```
(< 3 2)    =>  #f
(>= 4.5 3) =>  #t
```

Arithmetic procedures `+`, `-`, `*`, `/`, `expt` have the expected behavior:

```
(+ 1 2 3)    =>  6
(- 5.3 2)    =>  3.3
(- 5 2 1)    =>  2
(* 1 2 3)    =>  6
(/ 6 3)      =>  2
(/ 22 7)     =>  22/7
(expt 2 3)   =>  8
(expt 4 1/2) =>  2.0
```

For a single argument, `-` and `/` return the negation and the reciprocal respectively:

```
(- 4) =>  -4
(/ 4) =>  1/4
```

The procedures `max` and `min` return the maximum and minimum respectively of the number arguments supplied to them. Any number of arguments can be so supplied.

```
(max 1 3 4 2 3) =>  4
(min 1 3 4 2 3) =>  1
```

The procedure `abs` returns the absolute value of its argument.

```
(abs  3) =>  3
(abs -4) =>  4
```

This is just the tip of the iceberg. Scheme provides a large and comprehensive suite of arithmetic and trigonometric procedures. For instance, `atan`, `exp`, and `sqrt` respectively return the arctangent, natural antilogarithm, and square root of their argument. Consult R5RS for more details.
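A brief sketch of these three procedures follows. The exact printed form of the results is implementation-dependent (for instance, whether an exact or inexact number is returned), so the comments give only approximate values:

```scheme
(sqrt 16)       ; the square root of 16 (printed as 4 or 4.0, depending on the implementation)
(exp 1)         ; e, approximately 2.718
(* 4 (atan 1))  ; pi, approximately 3.14159, since (atan 1) is pi/4
```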

### 2.1.3 Characters

Scheme character data are represented by prefixing the character with `#\`. Thus, `#\c` is the character `c`. Some non-graphic characters have more descriptive names, eg, `#\newline`, `#\tab`. The character for space can be written `#\ `, or more readably, `#\space`.

The character predicate is `char?`:

```
(char? #\c) =>  #t
(char? 1)   =>  #f
(char? #\;) =>  #t
```

Note that a semicolon character datum does not trigger a comment.

The character data type has its set of comparison predicates: `char=?`, `char<?`, `char<=?`, `char>?`, `char>=?`.

```
(char=? #\a #\a)  =>  #t
(char<? #\a #\b)  =>  #t
(char>=? #\a #\b) =>  #f
```

To make the comparisons case-insensitive, use `char-ci` instead of `char` in the procedure name:

```
(char-ci=? #\a #\A) =>  #t
(char-ci<? #\a #\B) =>  #t
```

The case conversion procedures are `char-downcase` and `char-upcase`:

```
(char-downcase #\A) =>  #\a
(char-upcase #\a)   =>  #\A
```

### 2.1.4 Symbols

The simple data types we saw above are self-evaluating. Ie, if you typed any object from these data types to the listener, the evaluated result returned by the listener will be the same as what you typed in.

```
#t  =>  #t
42  =>  42
#\c =>  #\c
```

Symbols don't behave the same way. This is because symbols are used by Scheme programs as identifiers for variables, and thus will evaluate to the value that the variable holds. Nevertheless, symbols are a simple data type, and symbols are legitimate values that Scheme can traffic in, along with characters, numbers, and the rest.

To specify a symbol without making Scheme think it is a variable, you should quote the symbol:

```
(quote xyz)
=>  xyz
```

Since this type of quoting is very common in Scheme, a convenient abbreviation is provided. The expression

```
'E
```

will be treated by Scheme as equivalent to

```
(quote E)
```

Scheme symbols are named by a sequence of characters. About the only limitation on a symbol's name is that it shouldn't be mistakable for some other data, eg, characters or booleans or numbers or compound data. Thus, `this-is-a-symbol`, `i18n`, `<=>`, and `$!#*` are all symbols; `16`, `-i` (a complex number!), `#t`, `"this-is-a-string"`, and `(barf)` (a list) are not. The predicate for checking symbolness is called `symbol?`:

```
(symbol? 'xyz) =>  #t
(symbol? 42)   =>  #f
```

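In R5RS, Scheme symbols are normally case-insensitive. Thus the symbols `Calorie` and `calorie` are identical there (the later R6RS and R7RS standards make symbols case-sensitive, so implementations following them return `#f` for this comparison):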

```
(eqv? 'Calorie 'calorie)
=>  #t
```

We can use the symbol `xyz` as a global variable by using the form `define`:

```
(define xyz 9)
```

This says the variable `xyz` holds the value `9`. If we feed `xyz` to the listener, the result will be the value held by `xyz`:

```
xyz
=>  9
```

We can use the form `set!` to change the value held by a variable:

```
(set! xyz #\c)
```

Now

```
xyz
=>  #\c
```

## 2.2 Compound data types

Compound data types are built by combining values from other data types in structured ways.

### 2.2.1 Strings

Strings are sequences of characters (not to be confused with symbols, which are simple data that have a sequence of characters as their name). You can specify strings by enclosing the constituent characters in double-quotes. Strings evaluate to themselves.

```
"Hello, World!"
=>  "Hello, World!"
```

The procedure `string` takes a bunch of characters and returns the string made from them:

```
(string #\h #\e #\l #\l #\o)
=>  "hello"
```

Let us now define a global variable `greeting`.

```
(define greeting "Hello; Hello!")
```

Note that a semicolon inside a string datum does not trigger a comment.

The characters in a given string can be individually accessed and modified. The procedure `string-ref` takes a string and a (0-based) index, and returns the character at that index:

```
(string-ref greeting 0)
=>  #\H
```

New strings can be created by appending other strings:

```
(string-append "E "
               "Pluribus "
               "Unum")
=>  "E Pluribus Unum"
```

You can make a string of a specified length, and fill it with the desired characters later.

```
(define a-3-char-long-string (make-string 3))
```

The predicate for checking stringness is `string?`.

Strings obtained as a result of calls to `string`, `make-string`, and `string-append` are mutable. The procedure `string-set!` replaces the character at a given index:

```
(define hello (string #\H #\e #\l #\l #\o))
hello
=>  "Hello"

(string-set! hello 1 #\a)
hello
=>  "Hallo"
```
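The two-step approach mentioned earlier (make a string of a given length, fill it in later) can be sketched as follows. The name `abc` is ours, and the fill character passed as the optional second argument of `make-string` is a choice made for illustration; without it, the initial contents of the string are unspecified.

```scheme
;; Make a 3-character string of dashes, then overwrite each position.
(define abc (make-string 3 #\-))
abc                      ; => "---"

(string-set! abc 0 #\a)
(string-set! abc 1 #\b)
(string-set! abc 2 #\c)
abc                      ; => "abc"

(string-length abc)      ; => 3
(string? abc)            ; => #t
```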

### 2.2.2 Vectors

Vectors are sequences like strings, but their elements can be anything, not just characters. Indeed, the elements can be vectors themselves, which is a good way to generate multidimensional vectors.

Here's a way to create a vector of the first five integers:

```
(vector 0 1 2 3 4)
=>  #(0 1 2 3 4)
```

Note Scheme's representation of a vector value: a `#` character followed by the vector's contents enclosed in parentheses.

In analogy with `make-string`, the procedure `make-vector` makes a vector of a specific length:

```
(define v (make-vector 5))
```

The procedures `vector-ref` and `vector-set!` access and modify vector elements. The predicate for checking if something is a vector is `vector?`.
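A minimal sketch of these procedures in use. The names `v3` and `grid` are ours, and the second argument to `make-vector` is an optional fill value chosen for illustration. The last two lines show one way a vector of vectors can stand in for a two-dimensional table:

```scheme
(define v3 (make-vector 3 0))  ; three slots, each initially 0
(vector-set! v3 0 'a)
(vector-set! v3 2 "two")
v3                             ; => #(a 0 "two")

(vector-ref v3 2)              ; => "two"
(vector-length v3)             ; => 3
(vector? v3)                   ; => #t
(vector? "abc")                ; => #f

;; A 2-by-2 table as a vector whose elements are vectors.
(define grid (vector (vector 1 2) (vector 3 4)))
(vector-ref (vector-ref grid 1) 0)  ; => 3  (row 1, column 0)
```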

### 2.2.3 Dotted pairs and lists

A dotted pair is a compound value made by combining any two arbitrary values into an ordered couple. The first element is called the car, the second element is called the cdr, and the combining procedure is `cons`.

```
(cons 1 #t)
=>  (1 . #t)
```

Dotted pairs are not self-evaluating, and so to specify them directly as data (ie, without producing them via a `cons`-call), one must explicitly quote them:

```
'(1 . #t) =>  (1 . #t)

(1 . #t)  -->ERROR!!!
```

The accessor procedures are `car` and `cdr`:

```
(define x (cons 1 #t))

(car x)
=>  1

(cdr x)
=>  #t
```

The elements of a dotted pair can be replaced by the mutator procedures `set-car!` and `set-cdr!`:

```
(set-car! x 2)

(set-cdr! x #f)

x
=>  (2 . #f)
```

Dotted pairs can contain other dotted pairs.

```
(define y (cons (cons 1 2) 3))

y
=>  ((1 . 2) . 3)
```

The `car` of the `car` of this pair is `1`. The `cdr` of the `car` of this pair is `2`. Ie,

```
(car (car y))
=>  1

(cdr (car y))
=>  2
```

Scheme provides procedure abbreviations for cascaded compositions of the `car` and `cdr` procedures. Thus, `caar` stands for "`car` of `car` of", and `cdar` stands for "`cdr` of `car` of", etc.

```
(caar y)
=>  1

(cdar y)
=>  2
```

`c...r`-style abbreviations for up to four cascades are guaranteed to exist. Thus, `cadr`, `cdadr`, and `cdaddr` are all valid. `cdadadr` might be pushing it.
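To see how the longer abbreviations expand, here is a small sketch on a nested pair built with `cons`. The name `z` is ours. The letters between `c` and `r` are read right to left: the rightmost one is applied first.

```scheme
;; z is the pair ((1 . 2) . ((3 . 4) . 5))
(define z (cons (cons 1 2) (cons (cons 3 4) 5)))

(cadr z)   ; (car (cdr z))        => (3 . 4)
(cddr z)   ; (cdr (cdr z))        => 5
(caadr z)  ; (car (car (cdr z)))  => 3
(cdadr z)  ; (cdr (car (cdr z)))  => 4
```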

When nested dotting occurs along the second element, Scheme uses a special notation to represent the resulting expression:

```
(cons 1 (cons 2 (cons 3 (cons 4 5))))
=>  (1 2 3 4 . 5)
```

Ie, `(1 2 3 4 . 5)` is an abbreviation for `(1 . (2 . (3 . (4 . 5))))`. The last cdr of this expression is `5`.

Scheme provides a further abbreviation if the last cdr is a special object called the empty list, which is represented by the expression `()`. The empty list is not considered self-evaluating, and so one should quote it when supplying it as a value in a program:

```
'() =>  ()
```

The abbreviation for a dotted pair of the form `(1 . (2 . (3 . (4 . ()))))` is

```
(1 2 3 4)
```

This special kind of nested dotted pair is called a list. This particular list is four elements long. It could have been created by saying

```
(cons 1 (cons 2 (cons 3 (cons 4 '()))))
```

but Scheme provides a procedure called `list` that makes list creation more convenient. `list` takes any number of arguments and returns the list containing them:

```
(list 1 2 3 4)
=>  (1 2 3 4)
```

Indeed, if we know all the elements of a list, we can use `quote` to specify the list:

```
'(1 2 3 4)
=>  (1 2 3 4)
```

List elements can be accessed by index.

```
(define y (list 1 2 3 4))

(list-ref y 0) =>  1
(list-ref y 3) =>  4

(list-tail y 1) =>  (2 3 4)
(list-tail y 3) =>  (4)
```

`list-tail` returns the tail of the list starting from the given index.

The predicates `pair?`, `list?`, and `null?` check if their argument is a dotted pair, list, or the empty list, respectively:

```
(pair? '(1 . 2)) =>  #t
(pair? '(1 2))   =>  #t
(pair? '())      =>  #f
(list? '())      =>  #t
(null? '())      =>  #t
(list? '(1 2))   =>  #t
(list? '(1 . 2)) =>  #f
(null? '(1 2))   =>  #f
(null? '(1 . 2)) =>  #f
```

### 2.2.4 Conversions between data types

Scheme offers many procedures for converting among the data types. We already know how to convert between the character cases using `char-downcase` and `char-upcase`. Characters can be converted into integers using `char->integer`, and integers can be converted into characters using `integer->char`. (The integer corresponding to a character is usually its ASCII code.)

```
(char->integer #\d) =>  100
(integer->char 50)  =>  #\2
```

Strings can be converted into the corresponding list of characters.

```
(string->list "hello") =>  (#\h #\e #\l #\l #\o)
```

Other conversion procedures in the same vein are `list->string`, `vector->list`, and `list->vector`.
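A quick sketch of these three procedures, with results shown as comments. The last line round-trips a string through a list of characters:

```scheme
(list->string (list #\h #\i))          ; => "hi"
(vector->list (vector 1 2 3))          ; => (1 2 3)
(list->vector (list 'a 'b 'c))         ; => #(a b c)

(list->string (string->list "hello"))  ; => "hello"
```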

Numbers can be converted to strings:

```
(number->string 16) =>  "16"
```

Strings can be converted to numbers. If the string corresponds to no number, `#f` is returned.

```
(string->number "16")
=>  16

(string->number "Am I a hot number?")
=>  #f
```

`string->number` takes an optional second argument, the radix.

```
(string->number "16" 8) =>  14
```

because `16` in base 8 is the number fourteen.
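A few more radix conversions, as a sketch. `number->string` likewise accepts an optional radix argument, so the conversion above can be run in the other direction. A string containing a digit that is not valid in the given radix yields `#f`:

```scheme
(number->string 14 8)      ; => "16"   (fourteen written in octal)
(string->number "ff" 16)   ; => 255
(string->number "1100" 2)  ; => 12
(string->number "9" 8)     ; => #f     (9 is not an octal digit)
```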

Symbols can be converted to strings, and vice versa:

```
(symbol->string 'symbol)
=>  "symbol"

(string->symbol "string")
=>  string
```

## 2.3 Other data types

Scheme contains some other data types. One is the procedure. We have already seen many procedures, eg, `display`, `+`, `cons`. In reality, these are variables holding the procedure values, which are themselves not visible as are numbers or characters:

```
cons
=>  <procedure>
```

The procedures we have seen thus far are primitive procedures, with standard global variables holding them. Users can create additional procedure values.
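Since a procedure is a value like any other, it can be tested with the standard predicate `procedure?` and stored in a second variable. The sketch below assumes nothing beyond that; the name `my-cons` is ours:

```scheme
(procedure? cons)        ; => #t  (the value held by the variable cons)
(procedure? 'cons)       ; => #f  (a quoted symbol, not a procedure)
(procedure? (cons 1 2))  ; => #f  (the result of calling cons is a pair)

;; A second variable holding the same procedure value.
(define my-cons cons)
(my-cons 1 2)            ; => (1 . 2)
```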

Yet another data type is the port. A port is the conduit through which input and output is performed. Ports are usually associated with files and consoles.

In our "Hello, World!" program, we used the procedure `display` to write a string to the console. `display` can take two arguments, one the value to be displayed, and the other the output port it should be displayed on.

In our program, `display`'s second argument was implicit. The default output port used is the standard output port. We can get the current standard output port via the procedure-call `(current-output-port)`. We could have been more explicit and written

```
(display "Hello, World!" (current-output-port))
```
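Because a port is itself a value, it can be held in a variable and passed around. A minimal sketch, in which the name `out` is ours and `output-port?` is the standard predicate for output ports:

```scheme
(define out (current-output-port))

(output-port? out)             ; => #t
(display "Hello, World!" out)  ; writes Hello, World! to the console
(newline out)                  ; newline also accepts an optional port
```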

## 2.4 S-expressions

All the data types discussed here can be lumped together into a single all-encompassing data type called the s-expression (s for symbolic). Thus `42`, `#\c`, `(1 . 2)`, `#(a b c)`, `"Hello"`, `(quote xyz)`, `(string->number "16")`, and `(begin (display "Hello, World!") (newline))` are all s-expressions.
