The io_read() function and related utilities
The .tar filesystem's io_read() is the standard one that we've seen in the RAM disk—it decides if the request is for a file or a directory, and calls the appropriate function.
The .tar filesystem's tarfs_io_read_dir() is the exact same thing as the RAM disk version—after all, the directory entry structures in the extended attributes structure are identical.
The only function that differs is tarfs_io_read_file(), which reads the data from the .tar file on disk.
int
tarfs_io_read_file (resmgr_context_t *ctp, io_read_t *msg,
                    iofunc_ocb_t *ocb)
{
    int     nbytes;
    int     nleft;
    iov_t   *iovs;
    int     niovs;
    int     i;
    int     pool_flag;
    gzFile  fd;

    // we don't do any xtypes here...
    if ((msg -> i.xtype & _IO_XTYPE_MASK) != _IO_XTYPE_NONE) {
        return (ENOSYS);
    }

    // figure out how many bytes are left
    nleft = ocb -> attr -> attr.nbytes - ocb -> offset;

    // and how many we can return to the client
    nbytes = min (nleft, msg -> i.nbytes);

    if (nbytes) {

        // 1) open the on-disk .tar file
        if ((fd = gzopen (ocb -> attr -> type.vfile.name, "r")) == NULL) {
            return (errno);
        }

        // 2) calculate number of IOVs required for transfer
        niovs = (nbytes + BLOCKSIZE - 1) / BLOCKSIZE;
        if (niovs <= 8) {
            iovs = mpool_malloc (mpool_iov8);
            pool_flag = 1;
        } else {
            iovs = malloc (sizeof (iov_t) * niovs);
            pool_flag = 0;
        }
        if (iovs == NULL) {
            gzclose (fd);
            return (ENOMEM);
        }

        // 3) allocate blocks for the transfer
        for (i = 0; i < niovs; i++) {
            SETIOV (&iovs [i], cfs_block_alloc (ocb -> attr), BLOCKSIZE);
            if (iovs [i].iov_base == NULL) {
                for (--i ; i >= 0; i--) {
                    cfs_block_free (ocb -> attr, iovs [i].iov_base);
                }
                // release the IOV array as well as the blocks
                if (pool_flag) {
                    mpool_free (mpool_iov8, iovs);
                } else {
                    free (iovs);
                }
                gzclose (fd);
                return (ENOMEM);
            }
        }

        // 4) trim last block to correctly read last entry in a .tar file;
        //    a nonzero remainder is the length of the final, partial block
        if (nbytes % BLOCKSIZE) {
            iovs [niovs - 1].iov_len = nbytes % BLOCKSIZE;
        }

        // 5) get the data
        gzseek (fd, ocb -> attr -> type.vfile.off + ocb -> offset, SEEK_SET);
        for (i = 0; i < niovs; i++) {
            gzread (fd, iovs [i].iov_base, iovs [i].iov_len);
        }
        gzclose (fd);

        // return it to the client
        MsgReplyv (ctp -> rcvid, nbytes, iovs, niovs);

        // update flags and offset
        ocb -> attr -> attr.flags |= IOFUNC_ATTR_ATIME
                                  | IOFUNC_ATTR_DIRTY_TIME;
        ocb -> offset += nbytes;

        // free the blocks, then the IOV array
        for (i = 0; i < niovs; i++) {
            cfs_block_free (ocb -> attr, iovs [i].iov_base);
        }
        if (pool_flag) {
            mpool_free (mpool_iov8, iovs);
        } else {
            free (iovs);
        }
    } else {
        // nothing to return, indicate End Of File
        MsgReply (ctp -> rcvid, EOK, NULL, 0);
    }

    // already done the reply ourselves
    return (_RESMGR_NOREPLY);
}
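The arithmetic behind steps 2 and 4 is easy to get subtly wrong, so here it is in isolation. This is a minimal sketch: the helper names are ours, and we assume BLOCKSIZE is 4096 to match the block size mentioned in the text. Note that the remainder has to be taken with `%` (or, for a power-of-two block size, `& (BLOCKSIZE - 1)`); masking with BLOCKSIZE itself would test only a single bit of the byte count.

```c
#include <assert.h>

#define BLOCKSIZE 4096   /* assumption: 4096-byte blocks, as in the text */

/* Step 2: number of BLOCKSIZE-sized IOVs needed to carry nbytes,
   rounding up so that a partial final block still gets an IOV. */
int
iovs_needed (int nbytes)
{
    assert (nbytes > 0);
    return (nbytes + BLOCKSIZE - 1) / BLOCKSIZE;
}

/* Step 4: length of the final IOV.  A zero remainder means the last
   block is completely full, so it keeps its BLOCKSIZE length. */
int
last_iov_len (int nbytes)
{
    int rem;

    assert (nbytes > 0);
    rem = nbytes % BLOCKSIZE;
    return rem ? rem : BLOCKSIZE;
}
```

For example, a 4097-byte read needs two IOVs, the second of which is 1 byte long, while a typical 32 KB request needs exactly eight full blocks and no trimming.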
Many of the steps are the same as in the RAM disk version, so only steps 1 through 5 are documented here:
1. Notice that we keep the on-disk .tar file closed, and open it only as required. This is an area for improvement: you might find it slightly faster to keep a small cache of open .tar files, and maybe rotate them on an LRU basis. We keep the file closed so that we don't run out of file descriptors; after all, you can mount hundreds of .tar files with this resource manager.
2. We're still dealing with blocks, just as we did in the RAM-disk filesystem, because we need a place to put the data that we read from the disk file. We calculate the number of IOVs we're going to need for this transfer, and then allocate the iovs array.
3. Next, we call cfs_block_alloc() to get blocks from the block allocator, and bind them to the iovs array. In case of a failure, we free all the blocks and fail ungracefully. A better failure mode would have been to shrink the client's request size to what we can handle, and return that. However, when you analyze this, the typical request size is 32 KB (8 blocks), and if we don't have 32 KB lying around, then we might have bigger troubles ahead.
4. The last block is probably not going to be exactly 4096 bytes in length, so we need to trim it. Nothing bad would happen if we were to gzread() the extra data into the end of the block, because the client's transfer size is limited to the size of the resource stored in the attributes structure. So I'm just being extra paranoid.
5. And in this step, I'm being completely careless; we simply gzread() the data into the blocks with no error-checking whatsoever! :-)
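If you'd rather not be careless in step 5, the read loop can check what it gets back. zlib's gzread() returns the number of uncompressed bytes read, or -1 on error, so a short count means we hit the end of the archive early. The sketch below is an assumption-laden illustration, not the resource manager's code: to keep it free of zlib, the reader is passed in as a callback with the same shape as gzread(), and `mem_read()` and `struct mem_src` are hypothetical in-memory stand-ins for the archive.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/uio.h>

/* Same shape as gzread(): bytes read, 0 at end of file, -1 on error. */
typedef int (*read_fn_t) (void *handle, void *buf, unsigned len);

/* Fill each IOV in turn.  Returns 0 on success, or EIO on a read error
   or short read.  A short read counts as an error because the caller has
   already clamped the request to the size in the attributes structure. */
int
fill_iovs (read_fn_t reader, void *handle, struct iovec *iovs, int niovs)
{
    int i;

    for (i = 0; i < niovs; i++) {
        int want = (int) iovs [i].iov_len;
        int got = reader (handle, iovs [i].iov_base, (unsigned) want);

        if (got != want) {      /* covers both -1 and a short count */
            return EIO;
        }
    }
    return 0;
}

/* Hypothetical in-memory source, used only to exercise fill_iovs(). */
struct mem_src {
    const char *data;
    unsigned    len;
    unsigned    pos;
};

int
mem_read (void *handle, void *buf, unsigned len)
{
    struct mem_src *src = handle;
    unsigned avail = src -> len - src -> pos;

    if (len > avail) {
        len = avail;
    }
    memcpy (buf, src -> data + src -> pos, len);
    src -> pos += len;
    return (int) len;
}
```

In the resource manager the callback would be a thin wrapper around gzread(), and a nonzero result would mean freeing the blocks and IOVs and returning the error instead of calling MsgReplyv(). The return value of gzseek() (-1 on failure) deserves the same treatment.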
The rest of the code is standard; return the buffer to the client via MsgReplyv(), update the access flags and offset, free the blocks and IOVs, etc.
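The cache of open .tar files suggested in step 1 might look something like the following. Treat it as a rough sketch under several assumptions: the slot count, all of the names, and the callback-based design are ours, and the fake open and close functions exist only so the cache can be exercised without zlib. In the resource manager the callbacks would wrap gzopen() and gzclose().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS  4      /* hypothetical limit on open archives */
#define NAME_MAX_LEN 256    /* hypothetical limit on archive path length */

typedef void *(*open_fn_t) (const char *name);
typedef void  (*close_fn_t) (void *handle);

struct cache_slot {
    char          name [NAME_MAX_LEN];
    void         *handle;       /* NULL means the slot is free */
    unsigned long last_used;    /* clock value at the most recent use */
};

struct open_cache {
    struct cache_slot slots [CACHE_SLOTS];
    unsigned long     clock;
    open_fn_t         open_fn;
    close_fn_t        close_fn;
};

void
cache_init (struct open_cache *c, open_fn_t open_fn, close_fn_t close_fn)
{
    memset (c, 0, sizeof (*c));
    c -> open_fn = open_fn;
    c -> close_fn = close_fn;
}

/* Return an open handle for name; NULL if name is too long or open fails. */
void *
cache_get (struct open_cache *c, const char *name)
{
    struct cache_slot *victim = &c -> slots [0];
    void *h;
    int i;

    for (i = 0; i < CACHE_SLOTS; i++) {
        struct cache_slot *s = &c -> slots [i];

        if (s -> handle != NULL && strcmp (s -> name, name) == 0) {
            s -> last_used = ++c -> clock;      /* cache hit */
            return s -> handle;
        }
        /* prefer a free slot; otherwise track the least recently used */
        if (victim -> handle != NULL
            && (s -> handle == NULL || s -> last_used < victim -> last_used)) {
            victim = s;
        }
    }

    if (strlen (name) >= NAME_MAX_LEN) {
        return NULL;
    }
    if ((h = c -> open_fn (name)) == NULL) {
        return NULL;                            /* cache left untouched */
    }
    if (victim -> handle != NULL) {
        c -> close_fn (victim -> handle);       /* evict the LRU entry */
    }
    strcpy (victim -> name, name);
    victim -> handle = h;
    victim -> last_used = ++c -> clock;
    return h;
}

/* Counting stand-ins for gzopen()/gzclose(), for demonstration only. */
int opens, closes;
void *fake_open (const char *name) { opens++; return (void *) name; }
void fake_close (void *handle) { (void) handle; closes++; }
```

Two caveats apply. A cached handle keeps its file position between requests, so the gzseek() before each read is still required. And this sketch does no locking, so a multithreaded resource manager would need to protect the cache and keep a handle from being evicted while another thread is reading from it.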
