Assignment I: B-Trees

In this first assignment, you will be implementing a series of functions for manipulating B-Trees in chidb. This assignment is divided into the following steps:

  1. Opening a chidb file

  2. Loading a B-Tree node from the file

  3. Creating and writing a B-Tree node to disk

  4. Manipulating B-Tree cells

  5. Finding a value in a B-Tree

  6. Insertion into a leaf without splitting

  7. Insertion with splitting

  8. Supporting index B-Trees

All of these steps can be developed independently (the only exception is step 8, which is a cross-cutting feature). Nonetheless, testing a step will usually require that some of the previous steps be implemented (e.g., you can’t find a value in a B-Tree if you can’t open the file itself).

Before you get started

Before you start working on this assignment, make sure read The chidb File Format. You are not expected to understand the entire document at first, but you should at least get a general feel for how a chidb file is organized.

Most of your work will be done on the files src/libchidb/btree.[hc]. The header file contains declarations of functions you will have to implement, and which are referred to in the rest of the document. The specification of what the functions must do is included in the source as comments.

Step 1: Opening a chidb file

Implement the following functions:

int chidb_Btree_open(const char *filename, chidb *db, BTree **bt)
int chidb_Btree_close(BTree *bt);

Take into account that the chidb_Btree_open function can open an existing file, but can also create an empty database file if given the name of a file that does not exist. The latter functionality will not be possible until Step 3 is completed.

Step 2: Loading a B-Tree node from the file

Implement the following functions:

int chidb_Btree_getNodeByPage(BTree *bt, npage_t npage, BTreeNode **node);
int chidb_Btree_freeMemNode(BTree *bt, BTreeNode *btn);

Step 3: Creating and writing a B-Tree node to disk

Implement the following function:

int chidb_Btree_newNode(BTree *bt, npage_t *npage, uint8_t type);
int chidb_Btree_initEmptyNode(BTree *bt, npage_t npage, uint8_t type);
int chidb_Btree_writeNode(BTree *bt, BTreeNode *node);

Step 4: Manipulating B-Tree cells

Implement the following functions:

int chidb_Btree_getCell(BTreeNode *btn, ncell_t ncell, BTreeCell *cell);
int chidb_Btree_insertCell(BTreeNode *btn, ncell_t ncell, BTreeCell *cell);

Take into account that, once you’ve implemented chidb_Btree_getCell (which is the simpler of the two), you will be able to implement chidb_Btree_find or use the provided chidb_Btree_print. At that point, you will be able to open example database files (available at the chidb website) and verify that you can correctly print out their contents or search for specific values.

Step 5: Finding a value in a B-Tree

Implement the following function:

int chidb_Btree_find(BTree *bt, npage_t nroot, key_t key, uint8_t **data, uint16_t *size);

Step 6: Insertion into a leaf without splitting

Implement the following functions:

int chidb_Btree_insertInTable(BTree *bt, npage_t nroot,
                              key_t key, uint8_t *data, uint16_t size);
int chidb_Btree_insert(BTree *bt, npage_t nroot, BTreeCell *btc);
int chidb_Btree_insertNonFull(BTree *bt, npage_t npage, BTreeCell *btc);

Take into account that, at this point, chidb_Btree_insert will be little more than a call to chidb_Btree_insertNonFull. Also, even if at this point you are only inserting in leaf nodes, that doesn’t mean that your implementation shouldn’t work on a database file that does have internal nodes. So, chidb_Btree_insertNonFull should still traverse the tree in search of the leaf node to insert the cell in (but assuming that splitting will not be necessary)

Since the chidb file format is a subset of the SQLite format, once this step is completed you will be able to create a chidb file and open it in SQLite, as long as you include a valid schema table in the file.

Step 7: Insertion with splitting

Implement the following function:

int chidb_Btree_split(BTree *bt, npage_t npage_parent, npage_t npage_child,
                                 ncell_t parent_cell, npage_t *npage_child2);

Note that you will also have to modify chidb_Btree_insert and chidb_Btree_insertNonFull to split nodes when necessary.

Step 8: Supporting index B-Trees

Supporting index B-Trees affects almost all of the previous functions. As such, this step can either be done from the very beginning (at the cost of complicating the implementation of the previous functions), or added at the end (simplifying the implementation of the previous functions, but adding more work at the end of this assignment).

Besides modifying the previous functions, you will also have to implement the following function:

int chidb_Btree_insertInIndex(BTree *bt, npage_t nroot, key_t keyIdx, key_t keyPk);