# 一个澳洲大学IT类硕士专业的，有关数据库方面的考试题

Document your solution and every step towards the final solution.
Clearly state your name and the student number on the cover page and on each of your additional pages. Staple your
documents and hand them in using the assignment box for 329A in G block. Hand in by 6th June 13:00am.
1. Task: Physical Design, Access Structures 42 points
Consider a database system that uses a variant of B+ Trees for indexing, where leaf nodes contain actual data records, as opposed to pointers to data records (Oracle: index organized table). The index in question is on a unique attribute, which contains no duplicate entries. The database system is to have the following properties:
·Blocks are 4096 bytes in size. Header size is 96 bytes. Usable memory of the nodes is filled up to 85%.
·Each record is 400 bytes in size. A pointer requires 12 bytes. A record pointer (rowid) requires 12 bytes. The search key is 8 bytes in size.
·The data file to be indexed has 10,000,000 entries (records).
(a) How large will the index be maximal? Give your answer in terms of number of blocks, as well as number of bytes. Show each step in your calculations. (13 marks)
(b) How large will the index be maximal, provided that leaf nodes store record pointers instead of actual data records? Give your answer in terms of number of blocks, as well as number of bytes. Show each step in your calculations.(9 marks)
(c) Consider a range query querying 15% of the data. The queried data forms a sequence based on the search key a, e.g. select * from where a between x and y. Which of the following forms of access will be the fastest? Determine, for each of the four choices, the maximal required number of block accesses. Explicitly state any assumptions that you may be making. (16 marks)
i. Access using a B+ Tree with data leaf nodes as in (a).
ii. Access using a B+ Tree as in (b).
iii. Access using a hash-index on the unique attributes (extensible hash, directory in main memory).
iv. Access not using an index but reading the entire heap.
(d) What would be different in (c) if the index were a B* index? (2 marks)
(e) Concept Question: Why is it not possible to construct two or more primary indexes or table clusters for a table? Briefly explain your reasoning. (2 marks)

a)
bytes can be used in one block: (4096-96)*0.85=3400

leaf blocks : ((412*10000000)/3400)=1211765

leaf blocks per branch block=3400/(12+8)=170

the third level branch blocks needed : 1211765/170=7128

the second level branch blocks: 7128/170 = 42

root blocks: 1

blocks summary : 1211765 + 7128 + 42 + 1= 1218936

totoal space size : 1218936 * 4096 / 1024 /1024 = 4762 MB

b)
block usage block: (4096-96)*0.85=3400

leaf blocks : (((12+8)*10000000)/3400)=58824

leaf blocks per branc block=3400/(12+8)=170

the third level branch blocks needed : 58824/170=347

the second level branch blocks: 347/170 = 3

root blocks: 1

blocks summary : 58824 + 347 + 3 + 1 = 59175

totoal space size : 59175 * 4096 / 1024 /1024 = 232 MB

c)
i:
scaned blocks : 1218936 * 0.15 = 182841

ii:
scaned blocks : 59175 * 0.15 = 8877

iii:
hash-index can not be used in predicates such as "between x and y", it can only be used in "=".
so, it will be using full table scan.
the blocks: 1211765

iv:
table blocks: (400*10000000)/3400=1211765
query returned blocks : 1211765*0.15=181765
if full table scan, the blocks scanned is : 1211765

d)
B-trees and Bitmap index keys are used to find rows requiring I/O to process index
Hash gets rows with a key based algorithm, rows are stored based on a hashed value

e)
primary key means one key value can determine exact one row ,
if there are two or more primary keys, it will be conflict with primary key's definition.

• 博文量
35
• 访问量
25544