2019-08-02 22:28:55 +02:00
|
|
|
<html>
|
2019-07-30 20:33:09 +02:00
|
|
|
<head>
|
|
|
|
<title>Lab: file system</title>
|
|
|
|
<link rel="stylesheet" href="homework.css" type="text/css" />
|
|
|
|
</head>
|
|
|
|
<body>
|
|
|
|
|
|
|
|
<h1>Lab: file system</h1>
|
|
|
|
|
|
|
|
<p>In this lab you will add large files and <tt>mmap</tt> to the xv6 file system.
|
|
|
|
|
|
|
|
<h2>Large files</h2>
|
|
|
|
|
|
|
|
<p>In this assignment you'll increase the maximum size of an xv6
|
|
|
|
file. Currently xv6 files are limited to 268 blocks, or 268*BSIZE
|
|
|
|
bytes (BSIZE is 1024 in xv6). This limit comes from the fact that an
|
|
|
|
xv6 inode contains 12 "direct" block numbers and one "singly-indirect"
|
|
|
|
block number, which refers to a block that holds up to 256 more block
|
|
|
|
numbers, for a total of 12+256=268. You'll change the xv6 file system
|
|
|
|
code to support a "doubly-indirect" block in each inode, containing
|
|
|
|
256 addresses of singly-indirect blocks, each of which can contain up
|
|
|
|
to 256 addresses of data blocks. The result will be that a file will
|
|
|
|
be able to consist of up to 256*256+256+11 blocks (11 instead of 12,
|
|
|
|
because we will sacrifice one of the direct block numbers for the
|
|
|
|
double-indirect block).
|
|
|
|
|
|
|
|
<h3>Preliminaries</h3>
|
|
|
|
|
|
|
|
<p>Modify your Makefile's <tt>CPUS</tt> definition so that it reads:
|
|
|
|
<pre>
|
|
|
|
CPUS := 1
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
<b>XXX doesn't seem to speedup things</b>
|
|
|
|
<p>Add
|
|
|
|
<pre>
|
|
|
|
QEMUEXTRA = -snapshot
|
|
|
|
</pre>
|
|
|
|
right before
|
|
|
|
<tt>QEMUOPTS</tt>
|
|
|
|
<p>
|
|
|
|
The above two steps speed up qemu tremendously when xv6
|
|
|
|
creates large files.
|
|
|
|
|
|
|
|
<p><tt>mkfs</tt> initializes the file system to have fewer
|
|
|
|
than 1000 free data blocks, too few to show off the changes
|
|
|
|
you'll make. Modify <tt>param.h</tt> to
|
|
|
|
set <tt>FSSIZE</tt> to:
|
|
|
|
<pre>
|
|
|
|
#define FSSIZE 20000 // size of file system in blocks
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
<p>Download <a href="big.c">big.c</a> into your xv6 directory,
|
|
|
|
add it to the UPROGS list, start up xv6, and run <tt>big</tt>.
|
|
|
|
It creates as big a file as xv6 will let
|
|
|
|
it, and reports the resulting size. It should say 140 sectors.
|
|
|
|
|
|
|
|
<h3>What to Look At</h3>
|
|
|
|
|
|
|
|
The format of an on-disk inode is defined by <tt>struct dinode</tt>
|
|
|
|
in <tt>fs.h</tt>. You're particularly interested in <tt>NDIRECT</tt>,
|
|
|
|
<tt>NINDIRECT</tt>, <tt>MAXFILE</tt>, and the <tt>addrs[]</tt> element
|
|
|
|
of <tt>struct dinode</tt>. Look Figure 7.3 in the xv6 text for a
|
|
|
|
diagram of the standard xv6 inode.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
The code that finds a file's data on disk is in <tt>bmap()</tt>
|
|
|
|
in <tt>fs.c</tt>. Have a look at it and make sure you understand
|
|
|
|
what it's doing. <tt>bmap()</tt> is called both when reading and
|
|
|
|
writing a file. When writing, <tt>bmap()</tt> allocates new
|
|
|
|
blocks as needed to hold file content, as well as allocating
|
|
|
|
an indirect block if needed to hold block addresses.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
<tt>bmap()</tt> deals with two kinds of block numbers. The <tt>bn</tt>
|
|
|
|
argument is a "logical block" -- a block number relative to the start
|
|
|
|
of the file. The block numbers in <tt>ip->addrs[]</tt>, and the
|
|
|
|
argument to <tt>bread()</tt>, are disk block numbers.
|
|
|
|
You can view <tt>bmap()</tt> as mapping a file's logical
|
|
|
|
block numbers into disk block numbers.
|
|
|
|
|
|
|
|
<h3>Your Job</h3>
|
|
|
|
|
|
|
|
Modify <tt>bmap()</tt> so that it implements a doubly-indirect
|
|
|
|
block, in addition to direct blocks and a singly-indirect block.
|
|
|
|
You'll have to have only 11 direct blocks, rather than 12,
|
|
|
|
to make room for your new doubly-indirect block; you're
|
|
|
|
not allowed to change the size of an on-disk inode.
|
|
|
|
The first 11 elements of <tt>ip->addrs[]</tt> should be
|
|
|
|
direct blocks; the 12th should be a singly-indirect block
|
|
|
|
(just like the current one); the 13th should be your new
|
|
|
|
doubly-indirect block.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
You don't have to modify xv6 to handle deletion of files with
|
|
|
|
doubly-indirect blocks.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
If all goes well, <tt>big</tt> will now report that it
|
|
|
|
can write sectors. It will take <tt>big</tt> minutes
|
|
|
|
to finish.
|
|
|
|
|
|
|
|
<b>XXX this runs for a while!</b>
|
|
|
|
|
|
|
|
<h3>Hints</h3>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Make sure you understand <tt>bmap()</tt>. Write out a diagram of the
|
|
|
|
relationships between <tt>ip->addrs[]</tt>, the indirect block, the
|
|
|
|
doubly-indirect block and the singly-indirect blocks it points to, and
|
|
|
|
data blocks. Make sure you understand why adding a doubly-indirect
|
|
|
|
block increases the maximum file size by 256*256 blocks (really -1),
|
|
|
|
since you have to decrease the number of direct blocks by one).
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Think about how you'll index the doubly-indirect block, and
|
|
|
|
the indirect blocks it points to, with the logical block
|
|
|
|
number.
|
|
|
|
|
|
|
|
<p>If you change the definition of <tt>NDIRECT</tt>, you'll
|
|
|
|
probably have to change the size of <tt>addrs[]</tt>
|
|
|
|
in <tt>struct inode</tt> in <tt>file.h</tt>. Make sure that
|
|
|
|
<tt>struct inode</tt> and <tt>struct dinode</tt> have the
|
|
|
|
same number of elements in their <tt>addrs[]</tt> arrays.
|
|
|
|
|
|
|
|
<p>If you change the definition of <tt>NDIRECT</tt>, make sure to create a
|
|
|
|
new <tt>fs.img</tt>, since <tt>mkfs</tt> uses <tt>NDIRECT</tt> too to build the
|
|
|
|
initial file systems. If you delete <tt>fs.img</tt>, <tt>make</tt> on Unix (not
|
|
|
|
xv6) will build a new one for you.
|
|
|
|
|
|
|
|
<p>If your file system gets into a bad state, perhaps by crashing,
|
|
|
|
delete <tt>fs.img</tt> (do this from Unix, not xv6). <tt>make</tt> will build a
|
|
|
|
new clean file system image for you.
|
|
|
|
|
|
|
|
<p>Don't forget to <tt>brelse()</tt> each block that you
|
|
|
|
<tt>bread()</tt>.
|
|
|
|
|
|
|
|
<p>You should allocate indirect blocks and doubly-indirect
|
2019-08-02 22:28:55 +02:00
|
|
|
blocks only as needed, like the original <tt>bmap()</tt>.
|
|
|
|
|
|
|
|
<p>Optional challenge: support triple-indirect blocks.
|
|
|
|
|
|
|
|
<h2>Writing with a Log</h2>
|
|
|
|
|
|
|
|
Insert a print statement in bwrite (in bio.c) so that you get a
|
|
|
|
print every time a block is written to disk:
|
|
|
|
|
|
|
|
<pre>
|
|
|
|
printf("bwrite block %d\n", b->blockno);
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
Build and boot a new kernel and run this:
|
|
|
|
<pre>
|
|
|
|
$ rm README
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
<p>You should see a sequence of bwrite prints after the <tt>rm</tt>.</p>
|
|
|
|
|
|
|
|
<div class="question">
|
|
|
|
<ol>
|
|
|
|
<li>Annotate the bwrite lines with the kind of information that is
|
|
|
|
being written to the disk (e.g., "README's inode", "allocation
|
|
|
|
bitmap"). If the log is being written, note both that the log is being
|
|
|
|
written and also what kind of information is being written to the log.
|
|
|
|
<li>Mark with an arrow the first point at which, if a
|
|
|
|
crash occured, README would be missing after a reboot
|
|
|
|
(after the call to <tt>recover_from_log()</tt>).
|
|
|
|
</ol>
|
|
|
|
</p>
|
|
|
|
</div>
|
|
|
|
|
|
|
|
|
|
|
|
<h2>Crash safety</h2>
|
|
|
|
|
|
|
|
<p>This assignment explores the xv6 log in two parts.
|
|
|
|
First, you'll artificially create a crash which illustrates
|
|
|
|
why logging is needed. Second, you'll remove one
|
|
|
|
inefficiency in the xv6 logging system.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Submit your solution before the beginning of the next lecture
|
|
|
|
to <a href="https://6828.scripts.mit.edu/2018/handin.py/">the submission
|
|
|
|
web site</a>.
|
|
|
|
|
|
|
|
<h3>Creating a Problem</h3>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
The point of the xv6 log is to cause all the disk updates of a
|
|
|
|
filesystem operation to be atomic with respect to crashes.
|
|
|
|
For example, file creation involves both adding a new entry
|
|
|
|
to a directory and marking the new file's inode as in-use.
|
|
|
|
A crash that happened after one but before the other would
|
|
|
|
leave the file system in an incorrect state after a reboot,
|
|
|
|
if there were no log.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
The following steps will break the logging code in a way that
|
|
|
|
leaves a file partially created.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
First, replace <tt>commit()</tt> in <tt>log.c</tt> with
|
|
|
|
this code:
|
|
|
|
<pre>
|
|
|
|
#include "kernel/proc.h"
|
|
|
|
void
|
|
|
|
commit(void)
|
|
|
|
{
|
|
|
|
int pid = myproc()->pid;
|
|
|
|
if (log.lh.n > 0) {
|
|
|
|
write_log();
|
|
|
|
write_head();
|
|
|
|
if(pid > 1) // AAA
|
|
|
|
log.lh.block[0] = 0; // BBB
|
|
|
|
install_trans();
|
|
|
|
if(pid > 1) // AAA
|
|
|
|
panic("commit mimicking crash"); // CCC
|
|
|
|
log.lh.n = 0;
|
|
|
|
write_head();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
The BBB line causes the first block in the log to be written to
|
|
|
|
block zero, rather than wherever it should be written. During file
|
|
|
|
creation, the first block in the log is the new file's inode updated
|
|
|
|
to have non-zero <tt>type</tt>.
|
|
|
|
Line BBB causes the block
|
|
|
|
with the updated inode to be written to block 0 (whence
|
|
|
|
it will never be read), leaving the on-disk inode still marked
|
|
|
|
unallocated. The CCC line forces a crash.
|
|
|
|
The AAA lines suppress this buggy behavior for <tt>init</tt>,
|
|
|
|
which creates files before the shell starts.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Second, replace <tt>recover_from_log()</tt> in <tt>log.c</tt>
|
|
|
|
with this code:
|
|
|
|
<pre>
|
|
|
|
static void
|
|
|
|
recover_from_log(void)
|
|
|
|
{
|
|
|
|
read_head();
|
|
|
|
printf("recovery: n=%d but ignoring\n", log.lh.n);
|
|
|
|
// install_trans();
|
|
|
|
log.lh.n = 0;
|
|
|
|
// write_head();
|
|
|
|
}
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
This modification suppresses log recovery (which would repair
|
|
|
|
the damage caused by your change to <tt>commit()</tt>).
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Finally, remove the <tt>-snapshot</tt> option from the definition
|
|
|
|
of <tt>QEMUEXTRA</tt> in your Makefile so that the disk image will see the
|
|
|
|
changes.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Now remove <tt>fs.img</tt> and run xv6:
|
|
|
|
<pre>
|
|
|
|
% rm fs.img ; make qemu
|
|
|
|
</pre>
|
|
|
|
<p>
|
|
|
|
Tell the xv6 shell to create a file:
|
|
|
|
<pre>
|
|
|
|
$ echo hi > a
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
You should see the panic from <tt>commit()</tt>. So far
|
|
|
|
it is as if a crash occurred in a non-logging system in the middle
|
|
|
|
of creating a file.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Now re-start xv6, keeping the same <tt>fs.img</tt>:
|
|
|
|
<pre>
|
|
|
|
% make qemu
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
And look at file <tt>a</tt>:
|
|
|
|
<pre>
|
|
|
|
$ cat a
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
You should see <tt>panic: ilock: no type</tt>. Make sure you understand what happened.
|
|
|
|
Which of the file creation's modifications were written to the disk
|
|
|
|
before the crash, and which were not?
|
|
|
|
|
|
|
|
<h3>Solving the Problem</h3>
|
|
|
|
|
|
|
|
Now fix <tt>recover_from_log()</tt>:
|
|
|
|
<pre>
|
|
|
|
static void
|
|
|
|
recover_from_log(void)
|
|
|
|
{
|
|
|
|
read_head();
|
|
|
|
cprintf("recovery: n=%d\n", log.lh.n);
|
|
|
|
install_trans();
|
|
|
|
log.lh.n = 0;
|
|
|
|
write_head();
|
|
|
|
}
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Run xv6 (keeping the same <tt>fs.img</tt>) and read <tt>a</tt> again:
|
|
|
|
<pre>
|
|
|
|
$ cat a
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
This time there should be no crash. Make sure you understand why
|
|
|
|
the file system now works.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Why was the file empty, even though you created
|
|
|
|
it with <tt>echo hi > a</tt>?
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Now remove your modifications to <tt>commit()</tt>
|
|
|
|
(the if's and the AAA and BBB lines), so that logging works again,
|
|
|
|
and remove <tt>fs.img</tt>.
|
|
|
|
|
|
|
|
<h3>Streamlining Commit</h3>
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Suppose the file system code wants to update an inode in block 33.
|
|
|
|
The file system code will call <tt>bp=bread(block 33)</tt> and update the
|
|
|
|
buffer data. <tt>write_log()</tt> in <tt>commit()</tt>
|
|
|
|
will copy the data to a block in the log on disk, for example block 3.
|
|
|
|
A bit later in <tt>commit</tt>, <tt>install_trans()</tt> reads
|
|
|
|
block 3 from the log (containing block 33), copies its contents into the in-memory
|
|
|
|
buffer for block 33, and then writes that buffer to block 33 on the disk.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
However, in <tt>install_trans()</tt>, it turns out that the modified
|
|
|
|
block 33 is guaranteed to be still in the buffer cache, where the
|
|
|
|
file system code left it. Make sure you understand why it would be a
|
|
|
|
mistake for the buffer cache to evict block 33 from the buffer cache
|
|
|
|
before the commit.
|
|
|
|
|
|
|
|
<p>
|
|
|
|
Since the modified block 33 is guaranteed to already be in the buffer
|
|
|
|
cache, there's no need for <tt>install_trans()</tt> to read block
|
|
|
|
33 from the log. Your job: modify <tt>log.c</tt> so that, when
|
|
|
|
<tt>install_trans()</tt> is called from <tt>commit()</tt>,
|
|
|
|
<tt>install_trans()</tt> does not perform the needless read from the log.
|
|
|
|
|
2019-08-02 22:42:55 +02:00
|
|
|
<p>To test your changes, create a file in xv6, restart, and make sure
|
|
|
|
the file is still there.
|
|
|
|
|
|
|
|
<b>XXX Does this speedup bigfile?</b>
|
|
|
|
|
|
|
|
<b>XXX Maybe support lseek and modify shell to append to a file?</b>
|
|
|
|
|
2019-07-30 20:33:09 +02:00
|
|
|
|
|
|
|
</body>
|
|
|
|
</html>
|