<html>
<head>
<title>Lab: mmap</title>
<link rel="stylesheet" href="homework.css" type="text/css" />
</head>
<body>
<h1>Lab: mmap</h1>
<p>In this lab you will use <tt>mmap</tt> on Linux to demand-page a
very large table and add memory-mapped files to xv6.
<h2>Using mmap on Linux</h2>
<p>This assignment will make you more familiar with how to manage virtual memory
in user programs using the Unix system call interface. You can do this
assignment on any operating system that supports the Unix API (a Linux Athena
machine, your laptop with Linux or MacOS, etc.).
<p>Download the <a href="mmap.c">mmap homework assignment</a> and look
it over. The program maintains a very large table of square root
values in virtual memory. However, the table is too large to fit in
physical RAM. Instead, the square root values should be computed on
demand in response to page faults that occur in the table's address
range. Your job is to implement the demand faulting mechanism using a
signal handler and UNIX memory mapping system calls. To stay within
the physical RAM limit, we suggest using the simple strategy of
unmapping the previously mapped page whenever a new page is faulted in.
<p>To compile <tt>mmap.c</tt>, you need a C compiler, such as gcc. On Athena,
you can type:
<pre>
$ add gnu
</pre>
Once you have gcc, you can compile mmap.c as follows:
<pre>
$ gcc mmap.c -lm -o mmap
</pre>
This produces an executable named <tt>mmap</tt>, which you can run:
<pre>
$ ./mmap
page_size is 4096
Validating square root table contents...
oops got SIGSEGV at 0x7f6bf7fd7f18
</pre>
<p>When the process accesses the square root table, the mapping does not exist
and the kernel passes control to the signal handler code in
<tt>handle_sigsegv()</tt>. Modify the code in <tt>handle_sigsegv()</tt> to map
in a page at the faulting address, unmap a previous page to stay within the
physical memory limit, and initialize the new page with the correct square root
values. Use the function <tt>calculate_sqrts()</tt> to compute the values.
The program includes test logic that verifies if the contents of the
square root table are correct. When you have completed your task
successfully, the process will print &ldquo;All tests passed!&rdquo;.
<p>You may find that the man pages for mmap() and munmap() are helpful references.
<pre>
$ man mmap
$ man munmap
</pre>
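<p>The demand-faulting mechanism described above can be sketched as follows.
This is a minimal illustration for Linux, not the code in <tt>mmap.c</tt>: the
names <tt>on_fault</tt>, <tt>demand_region_create</tt>, and
<tt>resident_page</tt> are hypothetical, the range is reserved with
<tt>PROT_NONE</tt> (an assumption about how to make every access fault), and
each page is filled with a placeholder pattern rather than square roots.
Calling <tt>mmap()</tt> from a signal handler is assumed to be acceptable here,
as the assignment itself requires it.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* All names here are hypothetical; they are not taken from mmap.c. */
static unsigned char *region_base; /* start of the demand-paged range */
static size_t region_len;          /* length of the range in bytes */
static size_t page_size;
static void *resident_page;        /* the one page currently mapped, or NULL */

/* SIGSEGV handler: map the faulting page and evict the previous one. */
static void on_fault(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)si->si_addr;
    uintptr_t base = (uintptr_t)region_base;

    /* A fault outside our range is a real bug: give up. */
    if (addr < base || addr >= base + region_len)
        _exit(1);

    /* Stay within the one-page budget by dropping the previous page. */
    if (resident_page != NULL)
        munmap(resident_page, page_size);

    /* Map a fresh zero-filled page at the page-aligned fault address. */
    void *page = (void *)(addr & ~(uintptr_t)(page_size - 1));
    if (mmap(page, page_size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
        _exit(2);

    /* Placeholder contents: byte i of the region holds i % 251. */
    unsigned char *p = page;
    size_t first = (size_t)(p - region_base);
    for (size_t i = 0; i < page_size; i++)
        p[i] = (unsigned char)((first + i) % 251);

    resident_page = page;
}

/* Reserve npages of address space and install the handler.
 * Returns the start of the range, or NULL on failure. */
volatile unsigned char *demand_region_create(size_t npages)
{
    struct sigaction sa = {0};

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region_len = npages * page_size;

    /* Reserve address space with no access rights so every touch faults. */
    void *r = mmap(NULL, region_len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (r == MAP_FAILED)
        return NULL;
    region_base = r;
    resident_page = NULL;

    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO; /* needed so the handler receives si_addr */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;
    return region_base;
}
```

<p>After <tt>demand_region_create(8)</tt> returns, reading any byte in the
range triggers the handler, which maps that page and fills it before the
faulting instruction is retried. Only one page is resident at a time, so
touching a different page evicts the current one.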
<h2>Implement memory-mapped files in xv6</h2>
<p>In this assignment you will implement memory-mapped files in xv6.
The test program <tt>mmaptest</tt> tells you what should work.
<p>Here are some hints about how you might go about this assignment:
<ul>
<li>Start by adding the two system calls to the kernel, as you have
done for other system calls (e.g., <tt>sigalarm</tt>), but
don't implement them yet; just return an
error. Run <tt>mmaptest</tt> to observe the error.
<li>Keep track of what <tt>mmap</tt> has mapped for each process.
You will need to allocate a <tt>struct vma</tt> to record the
address, length, permissions, etc. for each virtual memory area
(VMA) that maps a file. Since xv6 doesn't have a
memory allocator in the kernel, you can use the same approach as
for <tt>struct file</tt>: have a global array of <tt>struct
vma</tt>s and have for each process a fixed-sized array of VMAs
(like the file descriptor array).
<li>Implement <tt>mmap</tt>: allocate a VMA, add it to the process's
table of VMAs, fill in the VMA, and find a hole in the process's
address space where you will map the file. You can assume that no
file will be bigger than 1GB. The VMA will contain a pointer to
a <tt>struct file</tt> for the file being mapped; you will need to
increase the file's reference count so that the structure doesn't
disappear when the file is closed (hint:
see <tt>filedup</tt>). You don't have to worry about overlapping
VMAs. Run <tt>mmaptest</tt>: the first <tt>mmap</tt> should
succeed, but the first access to the mmap-ed memory will fail,
because you haven't updated the page fault handler.
<li>Modify the page-fault handler from the lazy-allocation and COW
labs to call a VMA function that handles page faults in VMAs.
This function allocates a page, reads 4KB from the mmap-ed
file into the page, and maps the page into the address space of
the process. To read the page, you can use <tt>readi</tt>,
which allows you to specify an offset from where to read in the
file (but you will have to lock/unlock the inode passed
to <tt>readi</tt>). Don't forget to set the permissions correctly
on the page. Run <tt>mmaptest</tt>; you should get to the
first <tt>munmap</tt>.
<li>Implement <tt>munmap</tt>: find the <tt>struct vma</tt> for
the address and unmap the specified pages (hint:
use <tt>uvmunmap</tt>). If <tt>munmap</tt> removes all pages
from a VMA, you will have to free the VMA (don't forget to
decrement the reference count of the VMA's <tt>struct
file</tt>); otherwise, you may have to shrink the VMA. You can
assume that <tt>munmap</tt> will not split a VMA into two VMAs;
that is, we don't unmap a few pages in the middle of a VMA. If
an unmapped page has been modified and the file is
mapped <tt>MAP_SHARED</tt>, you will have to write the page back
to the file. RISC-V has a dirty bit (<tt>D</tt>) in a PTE to
record whether a page has ever been written to; add the
declaration to kernel/riscv.h and use it. Modify <tt>exit</tt>
to call <tt>munmap</tt> for the process's open VMAs.
Run <tt>mmaptest</tt>; you should pass <tt>mmaptest</tt>, but
probably not <tt>forktest</tt>.
<li>Modify <tt>fork</tt> to copy VMAs from parent to child. Don't
forget to increment the reference count for a VMA's <tt>struct
file</tt>. In the page fault handler of the child, it is OK to
allocate a new page instead of sharing the page with the
parent. The latter would be cooler, but it would require more
implementation work. Run <tt>mmaptest</tt>; make sure you pass
both <tt>mmaptest</tt> and <tt>forktest</tt>.
</ul>
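<p>The VMA bookkeeping in the hints above can be modeled in user space as
follows. This is a sketch under simplifying assumptions, not xv6 kernel code:
<tt>struct proc_model</tt>, <tt>NVMA</tt>, the field names, and the helper
functions are all hypothetical; the per-process array holds the VMAs directly
instead of pointing into a global array; and <tt>void *file</tt> stands in
for <tt>struct file *</tt>. Addresses and lengths are assumed to be
page-aligned.

```c
#include <stddef.h>
#include <stdint.h>

#define NVMA   16    /* hypothetical per-process limit */
#define PGSIZE 4096

/* One mapped region. Field names are illustrative only. */
struct vma {
    int      used;   /* is this slot in use? */
    uint64_t addr;   /* page-aligned start address */
    uint64_t len;    /* length in bytes, a multiple of PGSIZE */
    int      prot;   /* PROT_* bits passed to mmap */
    int      flags;  /* MAP_SHARED or MAP_PRIVATE */
    uint64_t offset; /* file offset that corresponds to addr */
    void    *file;   /* stands in for struct file * */
};

/* Stand-in for the part of struct proc that tracks mappings. */
struct proc_model {
    struct vma vmas[NVMA];
};

/* Claim a free slot, or return NULL if the table is full. */
struct vma *vma_alloc(struct proc_model *p)
{
    for (int i = 0; i < NVMA; i++) {
        if (!p->vmas[i].used) {
            p->vmas[i].used = 1;
            return &p->vmas[i];
        }
    }
    return NULL;
}

/* Find the VMA that contains va, as a page-fault handler would. */
struct vma *vma_find(struct proc_model *p, uint64_t va)
{
    for (int i = 0; i < NVMA; i++) {
        struct vma *v = &p->vmas[i];
        if (v->used && va >= v->addr && va < v->addr + v->len)
            return v;
    }
    return NULL;
}

/* Shrink or free a VMA for munmap(addr, len).
 * Returns 0 on success, -1 if the range is invalid or would
 * punch a hole in the middle (which the lab says never happens). */
int vma_unmap(struct vma *v, uint64_t addr, uint64_t len)
{
    if (len == 0 || addr < v->addr || addr + len > v->addr + v->len)
        return -1;
    if (addr == v->addr) {
        /* Trim from the front: the start and file offset both advance. */
        v->addr += len;
        v->offset += len;
        v->len -= len;
    } else if (addr + len == v->addr + v->len) {
        /* Trim from the back. */
        v->len -= len;
    } else {
        return -1;
    }
    if (v->len == 0) {
        /* Fully unmapped: a kernel version would drop the file reference here. */
        v->used = 0;
        v->file = NULL;
    }
    return 0;
}
```

<p>In a kernel version, the unmapping of the pages themselves and any
write-back would happen before the VMA is shrunk; this model covers only the
range arithmetic.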
<p>Run usertests to make sure you didn't break anything.
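<p>The page-fault and write-back hints involve a few small calculations,
sketched here as plain C so they can be checked outside the kernel. The
helper names (<tt>prot_to_pte</tt>, <tt>fault_file_offset</tt>,
<tt>needs_writeback</tt>) are hypothetical. The PTE bit positions follow the
RISC-V privileged specification; the <tt>PROT_*</tt> and <tt>MAP_*</tt>
values are assumed to match Linux, so check them against the definitions in
your xv6 tree.

```c
#include <stdint.h>

#define PGSIZE 4096
#define PGROUNDDOWN(a) ((a) & ~((uint64_t)PGSIZE - 1))

/* RISC-V PTE flag bits (Sv39). PTE_D is the one the lab asks you to add. */
#define PTE_V (1L << 0) /* valid */
#define PTE_R (1L << 1) /* readable */
#define PTE_W (1L << 2) /* writable */
#define PTE_X (1L << 3) /* executable */
#define PTE_U (1L << 4) /* user-accessible */
#define PTE_D (1L << 7) /* dirty: page has been written */

/* Assumed values (they match Linux); verify against your headers. */
#define PROT_READ   0x1
#define PROT_WRITE  0x2
#define PROT_EXEC   0x4
#define MAP_SHARED  0x01
#define MAP_PRIVATE 0x02

/* Translate mmap protection bits into PTE permission bits. */
uint64_t prot_to_pte(int prot)
{
    uint64_t perm = PTE_U; /* user pages must always carry PTE_U */
    if (prot & PROT_READ)
        perm |= PTE_R;
    if (prot & PROT_WRITE)
        perm |= PTE_W;
    if (prot & PROT_EXEC)
        perm |= PTE_X;
    return perm;
}

/* File offset to read from for a fault at va, given the VMA's start
 * address and the file offset that corresponds to that start. */
uint64_t fault_file_offset(uint64_t vma_addr, uint64_t vma_offset, uint64_t va)
{
    return vma_offset + (PGROUNDDOWN(va) - vma_addr);
}

/* Should this page be written back to the file when it is unmapped? */
int needs_writeback(uint64_t pte, int flags)
{
    return (flags & MAP_SHARED) && (pte & PTE_V) && (pte & PTE_D);
}
```

<p>For example, a fault at <tt>0x40001234</tt> in a VMA that starts at
<tt>0x40000000</tt> with file offset 0 reads from file offset 4096, and the
result would be the offset argument passed to <tt>readi</tt>.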
<p>Optional challenges:
<ul>
<li>If two processes have the same file mmap-ed (as
in <tt>forktest</tt>), share their physical pages. You will need
reference counts on physical pages.
<li>The solution above allocates a new physical page for each page
read from the mmap-ed file, even though the data is also in kernel
memory in the buffer cache. Modify your implementation to mmap
that memory, instead of allocating a new page. This requires that
file blocks be the same size as pages (set <tt>BSIZE</tt> to
4096). You will need to pin mmap-ed blocks into the buffer cache.
You will need to worry about reference counts.
<li>Remove redundancy between your implementation for lazy
allocation and your implementation of mmap-ed files. (Hint:
create a VMA for the lazy allocation area.)
<li>Modify <tt>exec</tt> to use a VMA for different sections of
the binary so that you get on-demand-paged executables. This will
make starting programs faster, because <tt>exec</tt> will not have
to read any data from the file system.
<li>Implement on-demand paging: don't keep a process in memory,
but let the kernel move some parts of processes to disk when
physical memory is low. Then, page in the paged-out memory when
the process references it. Port your Linux program from the first
assignment to xv6 and run it.
</ul>
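<p>The first challenge depends on reference counts for physical pages. The
idea can be sketched in user space as below. This is a toy model, not xv6
code: pages are identified by a small index instead of a physical address,
and <tt>NPAGES</tt>, <tt>page_alloc</tt>, <tt>page_share</tt>, and
<tt>page_release</tt> are hypothetical names. A kernel version would also
need a lock around the counters.

```c
#include <stddef.h>

#define NPAGES 8 /* hypothetical number of physical pages in the model */

/* refcnt[i] == 0 means page i is free. */
static int refcnt[NPAGES];

/* Allocate a free page with one reference; return its index or -1. */
int page_alloc(void)
{
    for (int i = 0; i < NPAGES; i++) {
        if (refcnt[i] == 0) {
            refcnt[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Record that another mapping now points at the page. */
void page_share(int pg)
{
    refcnt[pg]++;
}

/* Drop one reference. Returns 1 if that was the last reference and
 * the page became free, 0 if other mappings still use it. */
int page_release(int pg)
{
    refcnt[pg]--;
    return refcnt[pg] == 0;
}
```

<p>With this scheme, unmapping a shared page in one process only frees the
underlying memory when no other process still maps it.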
</body>
</html>