Rixstep

Weaving One's Way

Through a Unix file system.


Everything's a file on Unix. Everything. Directories are files; even your devices are files. [They're seen in /dev but you'll have to drop to a Terminal window or use Xfile to see them.]

This was a big thing back in the days of Snow White and the Seven Dwarfs. Concepts such as 'stream of bytes' and 'everything's a file' were radical.

Unix didn't invent the hierarchical file system but it certainly brought it mainstream. Original MS-DOS - like CP/M before it - admitted only of drive letters and it wasn't until version 2 that Microsoft introduced a type of hierarchy. Apple's first file system for the Macintosh (MFS) wasn't hierarchical either.

Original versions of programs such as od could 'dump' directories and Brian Kernighan used these dumps to show people how elementary the Unix file system actually was.

A Unix file system - like Unix itself - is a quintessential example of finding elegance in simplicity. It's the ultimate signature of Unix co-father Ken Thompson. Unlike other systems Unix puts precious little in the directories themselves, leaving the brunt of the information to other structures.

Unix directory entries contain but two things: the names of the files and their corresponding inodes. An inode (strictly speaking, an inode number) is a made up name for a numerical index. Using this numerical index becomes eminently simple once you understand how it's used.
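
A quick way to see those two things for yourself is the 'i' switch to ls. The sketch below assumes a throwaway directory from mktemp and two made-up file names (alpha and beta); the numbers you get will be your own.

```shell
# Assumption: a scratch directory; the file names are invented for the demo.
dir=$(mktemp -d)
touch "$dir/alpha" "$dir/beta"

# ls -i prints each directory entry as "<inode number> <name>".
listing=$(ls -i "$dir")
printf '%s\n' "$listing"

# Two entries: two names, each paired with a number.
entries=$(printf '%s\n' "$listing" | wc -l | tr -d ' ')
echo "entries: $entries"
rm -r "$dir"
```

Name and number: that's all the directory holds.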

All computer volumes contain a control block. [A volume is a physical or 'virtual' disk - either the entire hard drive or a 'partition'.] Microsoft's infamous 'FAT' has control blocks - two FATs or file allocation tables. FATs are pigeon holes. They're a contiguous array of indexes corresponding to the clusters on a volume.

[Clusters are groups of disk sectors. A sector - the smallest unit the disk itself can address - has traditionally been 512 bytes. Computer systems use clusters instead of actual disk sectors as their input/output 'atom' to speed disk operations up.]

Unix has a control block known as the ilist. Again: it's a made up name and you don't need to assign it any special significance. But what's important to visualise is its being contiguous disk space.

So if you can locate the ilist and you have an inode you want information about and you know the size of each chunk of data in the ilist - you could simply multiply the inode (index) times the size of the chunk of data to arrive at the offset you needed. Right?

So if a Unix directory contains only file names and indexes and these indexes 'index' into the ilist - then what's in the ilist?

The short smart-arsed answer: everything else. Of course!

iblock

The data in a Unix volume ilist pertaining to a specific file is called an iblock. Again: it's a made up name. What's important is to understand this block of data is of a uniform size (in addition to being well defined).

Thus the data for any file is located at an offset (size of iblock) * inode from the start of the ilist.
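
The arithmetic is no more than this. Both numbers below are invented for illustration: real iblock sizes vary by file system, and many real file systems start counting inodes at 1 and so subtract one first.

```shell
# Hypothetical values for illustration only.
iblock_size=128   # bytes per iblock (assumed)
inode=42          # index taken from a directory entry (assumed)

# Offset of this file's iblock from the start of the ilist.
offset=$((iblock_size * inode))
echo "iblock for inode $inode starts $offset bytes into the ilist"
```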

So what does an iblock contain? What does a file need?

  • At the very least you need time stamps. You need stamps for when a file was last accessed and last modified and perhaps a third time for when the file was created or when its status was changed.
  • You also need to know who 'owns' the file. On Unix there are two numeric values associated with every file (and directory and device) and they're both stored here. They're IDs for the file's 'owner' and the file's 'user group'. The pretty names you can forget about for now - they're created elsewhere as needs be.
  • You also need to know who is allowed to do what with the file. But before you decide this you have to decide how you're going to divide up file operations. On Unix there are only three such operations: read, write, and execute. [Other systems such as NetWare and NTFS go way overboard here. A bit of thought shows that all possible permutations can be reduced to the same read, write, and execute operations at some place in the file system.] More on this later.
  • You need to know the file's size. And this holds even if it's a directory. You need this later as well when you're double-checking I/O - file reads and writes.
  • You need to know where on the disk the file is located. Unix uses a number of direct and indirect pointers to the clusters where a file is located. The first few entries are for actual disk cluster numbers; as the file grows the following clusters are used only to contain further entries for further cluster numbers; and so forth.
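
Most of these fields can be seen from the command line. A minimal sketch, assuming a scratch file named sample in a mktemp directory: the 'i' switch to ls adds the inode number and the 'n' switch shows owner and group as raw numeric IDs rather than the pretty names.

```shell
dir=$(mktemp -d)
printf 'hello\n' > "$dir/sample"    # six bytes of data

# -l long format, -i inode number, -n numeric owner and group IDs.
line=$(ls -lin "$dir/sample")
echo "$line"

# Split the line on whitespace (deliberately unquoted).
# Fields: inode, mode, link count, uid, gid, size, then time stamp and name.
set -- $line
inode=$1 mode=$2 links=$3 uid=$4 gid=$5 size=$6
echo "inode $inode  owner id $uid  group id $gid  size $size bytes"
rm -r "$dir"
```

The disk pointers themselves aren't shown by ls; they stay inside the file system.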

That's enough for now. A file system needs to know more but there's no point in attempting an information overload. What remains can be taken up later.

[Another curiosity is this iblock does not have the file name itself. This can come back to haunt you.]
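
One consequence of keeping the name out of the iblock is that two directory entries can point at the same inode. A sketch, assuming the made-up names first and second in a scratch directory; ln without switches creates what's usually called a hard link.

```shell
dir=$(mktemp -d)
echo 'same data' > "$dir/first"
ln "$dir/first" "$dir/second"     # a second directory entry, same inode

# Both names report the same inode number.
ino_first=$(ls -i "$dir/first" | awk '{print $1}')
ino_second=$(ls -i "$dir/second" | awk '{print $1}')
echo "first: $ino_first  second: $ino_second"

# Removing one name leaves the data reachable through the other.
rm "$dir/first"
content=$(cat "$dir/second")
echo "$content"
rm -r "$dir"
```

Given only the inode, the file system has no single 'real' name to hand back.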

File Types

Now might be a good time to bring up the subject of Unix file types. Unix file types are not things like 'TextEdit Document' but actual bona fide file types as seen by the file system. On OS X, as on the BSDs, they're eight in number.

  • Regular files. Exactly what you think they are. Files that don't fit in any of the other categories.
  • Named pipes (FIFO buffers). These are used for interprocess communications. A 'FIFO' buffer is a 'first in first out buffer'. Things are put in one end by someone and taken out the other end by someone. 'Named pipe' and 'FIFO' are two names for the same file type: a pipe with a name in the file system.
  • Character devices. Normally located in your /dev directory, these are 'devices' that input/output one byte at a time.
  • Block devices. Normally located in your /dev directory, these are 'devices' that input/output one 'block' (typically 512 bytes to a few kilobytes) of data at a time.
  • Directories. Unix directories have their own special file type. Yes directories are files just like any other but the system needs to know if it's looking at directories when it processes commands.
  • Symbolic links. Symbolic links contain one thing and one thing only: a (Unix) path to another file.
  • Whiteouts. These critters aren't often used but when used can make other files seem to (temporarily) disappear.
  • Sockets. The mainstay of the Internet and TCP/IP, sockets can have an on-disk representation in Unix.
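
Four of the eight are easy to create yourself. The sketch below assumes a scratch directory and made-up names; ls marks the type in the first column of a long listing: '-' for a regular file, 'd' for a directory, 'l' for a symbolic link, 'p' for a named pipe.

```shell
dir=$(mktemp -d)
touch "$dir/regular"
mkdir "$dir/directory"
ln -s regular "$dir/symlink"
mkfifo "$dir/fifo"

# The first character of each long listing names the file type.
types=""
for name in regular directory symlink fifo; do
    first=$(ls -ld "$dir/$name" | cut -c1)
    echo "$first  $name"
    types="$types$first"
done
rm -r "$dir"
```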

Octal

Octal numerical representations were all the vogue when Unix was created; today hexadecimal is more commonly used. Octal numbers - as the name implies - have a radix of 8 rather than the 10 of decimal, the 16 of hexadecimal, or the 2 of binary.

The greatest possible digit in octal is therefore 7. [Naturally it's 9 in decimal, 15 or 'F' in hexadecimal, and 1 in binary.]

As with all systems the numbers represent successive powers of the radix. The octal number '777' represents - reading from right to left - '7 times 8 to the power 0' (7) plus '7 times 8 to the power 1' (56) plus '7 times 8 to the power 2' (448) or 511.
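
Your shell can do the sums for you. A small sketch: a leading 0 marks an octal constant in shell arithmetic, and printf's %o conversion goes the other way.

```shell
# A leading 0 marks an octal constant in shell arithmetic.
decimal=$((0777))
echo "octal 777 is decimal $decimal"

# The same sum written out digit by digit.
sum=$((7 * 1 + 7 * 8 + 7 * 64))
echo "7 + 56 + 448 = $sum"

# And back again: %o prints a number in octal.
octal=$(printf '%o' 511)
echo "decimal 511 is octal $octal"
```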

Unix file permissions are organised according to file owner or user, file's group, and everybody else. File permissions are a combination of read, write, and execute with read given the octal value 4, write given the octal value 2, and execute given the octal value 1.

Reading Unix file permissions from right to left you'll have one octal digit each for 'everyone else', for the file's group members, and for the file's owner. Note that these permissions apply to all types of files - directories are included here too.

Unix file types are also given in octal. A 'regular file' is given as 100000 octal; a directory as 040000 octal; the other file types are assigned values in a similar fashion.

The Missing Digit

It's plain to see - if you're hawk-eyed or have already been told - that there's a 'digit hole' here: the file types occupy the 5th and 6th octal digits from the right; and the file permissions take the first three digits from the right. The 4th digit hasn't been used yet.

The 4th digit is used for three special cases: the sticky bit, the set GID bit, and the set UID bit.

The sticky bit was originally used on executables (program files) to clue in the virtual memory manager that these files should preferably remain in memory as they're used so often; the set GID bit and the set UID bit are used in privilege escalation, a topic discussed much later!
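
The 4th digit shows up in a listing too. This sketch assumes a scratch directory with the made-up names shared and tool; the values used (1 for sticky and 4 for set UID, leaving 2 for set GID) are the standard ones rather than anything stated above.

```shell
dir=$(mktemp -d)
mkdir "$dir/shared"
touch "$dir/tool"

chmod 1755 "$dir/shared"   # 1 in the 4th digit: sticky bit
chmod 4755 "$dir/tool"     # 4 in the 4th digit: set UID bit

# ls shows the special bits in the execute positions: 't' and 's'.
sticky=$(ls -ld "$dir/shared" | cut -c1-10)
setuid=$(ls -l "$dir/tool" | cut -c1-10)
echo "$sticky  shared"
echo "$setuid  tool"
rm -r "$dir"
```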

Putting It All Together

Unix file types and permissions may thus be described in a total of six octal digits.

The 'file' Safari.app (actually a directory of course) has the following octal digits associated with it for file type and permissions: 040775; what can be construed from 040775?

  • The '04' means it's a directory. Which we already knew. But this corroborates it.
  • The '0' after the '04' means it has no special bits. Such as sticky, set GID, or set UID.
  • The '7' after '040' means the file owner has full permissions. 4 (read) + 2 (write) + 1 (execute) is 7.
  • The '7' after '0407' means the file's group has full permissions. 4 (read) + 2 (write) + 1 (execute) is 7.
  • The '5' after '04077' means no one else has write permissions. 4 (read) + 1 (execute) is 5. (No 2 for write.)
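
The same dissection can be done with shell arithmetic. A sketch only: the masks are the standard octal ones for each group of digits and the variable names are invented here.

```shell
mode=$((040775))   # the Safari.app value from the text

# Mask off each group of bits and shift it down to a single value.
ftype=$(printf '%02o' $(( (mode & 0170000) >> 12 )))
special=$(( (mode & 07000) >> 9 ))
owner=$(( (mode & 0700) >> 6 ))
group=$(( (mode & 070) >> 3 ))
other=$(( mode & 07 ))
echo "type=$ftype special=$special owner=$owner group=$group other=$other"
```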

drwxrwxr-x

Ultimately 'octal' is the only way to go when working with Unix file permissions but the Unix command line offers an alternative. Listing Safari.app from the command line is done like this.

$ ls -dl Safari.app
drwxrwxr-x   3 root  admin  102 Jul  1 13:28 Safari.app

The command 'ls' is short ('Ken Thompson speak') for 'list'; the 'switches' 'd' and 'l' prefixed with the hyphen stand for 'directory' and 'long' respectively. ls will normally list the contents of any directory given; to have it list the specs for a directory itself the switch 'd' is used. To get all the data (the 'long' listing) the switch 'l' ('ell') is used.

The 'drwxrwxr-x' at the left of the command output corresponds to Safari.app's octal 040775 type and permissions.

  • The 'd' at the beginning unsurprisingly stands for 'directory'. Or the octal '04'.
  • The first group of three characters after that are permissions for the file owner. Or 'rwx' indicating the owner has full (7) permissions.
  • The next group of three characters after that are permissions for the file's group. Or 'rwx' indicating the group has full (7) permissions.
  • The next group of three characters after that are permissions for everyone else. Or 'r-x' indicating everyone else gets only read and execute (5) permissions. (No 2 for write.)
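
You can check the correspondence on a directory of your own. Example.app below is a hypothetical stand-in created in a scratch directory, not a real bundle; cut keeps only the ten type-and-permission characters.

```shell
dir=$(mktemp -d)
mkdir "$dir/Example.app"        # hypothetical stand-in for a bundle directory

chmod 775 "$dir/Example.app"
perms_775=$(ls -ld "$dir/Example.app" | cut -c1-10)
echo "775 -> $perms_775"

# Drop write for the group and everything for everyone else.
chmod 750 "$dir/Example.app"
perms_750=$(ls -ld "$dir/Example.app" | cut -c1-10)
echo "750 -> $perms_750"
rm -r "$dir"
```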

And who owns Safari.app? That's easy to see. The owner and group are given in the same command output: root:admin. If you're root or in the admin group you'll be able to write to Safari.app; if you're not you won't.

But writing to Safari.app? What's 'writing' to Safari.app? Is it writing to one of the files in the Safari.app bundle?

Yes it is - but perhaps not in the way you thought.

Writing to Safari.app?

Having write permissions on Safari.app literally means you may write to (modify) Safari.app - not the Safari application bundle but the directory itself.

Recap from above: Unix directories contain but two things: file names and inodes. You're not about to go changing an inode - and it'd be rather difficult even if you wanted to; but you might for example want to rename a file?

And go ahead - you can do this with Safari.app if you're root or a member of admin. For in so doing you'll be writing to Safari.app the directory.

Or you might want to place a new file in the Safari.app directory? If you're root or an admin that's your option. It's not advisable but it's your prerogative.

Or removing a file from Safari.app? [There's only one but still and all.] If you're root or an admin you can do it. No one's stopping you.

How about dropping down a level to Contents and renaming that? Only if you're root or an admin.

How about dropping down inside Contents and messing something in there up? That no longer depends on Safari.app - it depends on the permissions on Contents instead.
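
A small experiment makes the point. It assumes a scratch directory you own and a made-up file named locked: the file itself is read-only, yet the rename goes through because only the directory is written to.

```shell
dir=$(mktemp -d)                 # you own this directory and may write to it
echo 'contents' > "$dir/locked"
chmod 444 "$dir/locked"          # the file itself is now read-only

# Renaming changes the directory's name entry, not the file.
mv "$dir/locked" "$dir/renamed"
after=$(ls "$dir")
perms=$(ls -l "$dir/renamed" | cut -c1-10)
echo "$after  $perms"
rm -rf "$dir"
```

The file's own permissions never came into it.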

Get it?

Hands On!

Yet all files have three permissions. What do the read and execute permissions mean on a directory?

You're about to find out.

Follow the code in the box below. Any line with a '$' leading it is something you're supposed to type in; all other lines are command output (and should closely resemble your own).

See if you can figure out what's going on and why certain things paradoxically seem impossible.

[Note: these commands can't harm you - you're not using privilege escalation and you're staying well within the confines of your own home area.]

Last login: Wed Jul 29 23:17:32 on ttyp1
$ cd ~/Documents
$ pwd
/Users/<YOU>/Documents
$ mkdir TEST
$ ls TEST
$ ls -d TEST
TEST
$ ls -dl TEST
drwxr-xr-x   2 <YOU>  <GROUP>  68 Jul 30 06:46 TEST
$ groups
<YOU> appserveradm appserverusr admin
$ id
uid=501(<YOU>) gid=501(<YOU>) groups=501(<YOU>), 81(appserveradm), 79(appserverusr), 80(admin)
$ cd TEST
$ pwd
/Users/<YOU>/Documents/TEST
$ ls
$ ls -al
total 0
drwxr-xr-x   2 <YOU>  <GROUP>   68 Jul 30 06:46 .
drwx------   8 <YOU>  <GROUP>  272 Jul 30 06:46 ..
$ cd ..
$ pwd
/Users/<YOU>/Documents
$ chmod 0 TEST
$ ls -dl TEST
d---------   2 <YOU>  <GROUP>  68 Jul 30 06:46 TEST
$ ls -al TEST
ls: TEST: Permission denied
$ cd TEST
-bash: cd: TEST: Permission denied
$ chmod 755 TEST
$ ls -dl TEST
drwxr-xr-x   2 <YOU>  <GROUP>  68 Jul 30 06:46 TEST
$ ls -al TEST
total 0
drwxr-xr-x   2 <YOU>  <GROUP>   68 Jul 30 06:46 .
drwx------   8 <YOU>  <GROUP>  272 Jul 30 06:46 ..
$ cd TEST
$ pwd
/Users/<YOU>/Documents/TEST
$ cd ..
$ pwd
/Users/<YOU>/Documents
$ rmdir TEST
$ ls -dl TEST
ls: TEST: No such file or directory
$

Homework #1

A friend writes to you and says he's having trouble installing LaTeX. He tries to copy the files where he's been told to copy them but he keeps getting 'permission denied'. He insists he has permission - he's inspected each and every last file in the LaTeX setup and he has permissions on them all. He's supposed to be installing LaTeX to one of the 'Unix' 'bin' directories.

Can you tell him what's wrong and what's wrong with his way of thinking?

Homework #2

The system login items root privilege escalation hole (still open on Tiger 10.4, Leopard 10.5 and Snow Leopard 10.6) lets a rogue process inject startup instructions in a file located in /Library/Preferences. An oft-heard solution for the hole is to drop 'write bits' on the file and set its ownership to root:admin. [If the file doesn't exist you simply create it first.]

[Note: you normally need to be a member of the 'admin' group to use the 'sudo' command.]

Last login: Wed Jul 30 23:23:52 on ttyp1
$ cd /Library/Preferences
$ pwd
/Library/Preferences
$ touch com.apple.systemloginitems.plist
$ ls -l com.apple.systemloginitems.plist
-rw-r--r--   1 <YOU>  <GROUP>  0 Jul 30 23:34 com.apple.systemloginitems.plist
$ chmod 0 com.apple.systemloginitems.plist
$ ls -l com.apple.systemloginitems.plist
----------   1 <YOU>  <GROUP>  0 Jul 30 23:34 com.apple.systemloginitems.plist
$ echo 'Trying to write to the file.' >com.apple.systemloginitems.plist
-bash: com.apple.systemloginitems.plist: Permission denied
$ sudo chown 0:0 com.apple.systemloginitems.plist
Password:
$ ls -l com.apple.systemloginitems.plist
----------   1 root    wheel   0 Jul 30 23:34 com.apple.systemloginitems.plist
$

So the file appears unassailable - it's owned by root:wheel (the highest owner/group) and it has no permissions whatsoever. You can't write to it and you can't read it either!

But your system is still wide open to attack by this exploit. Can you figure out why?

Cleanup

Best to clean up after the last assignment. If there's nothing in that system login items file just remove it.

$ rm /Library/Preferences/com.apple.systemloginitems.plist

Homework #3

Read up on what sticky bits do today and then try to figure out why (starting with 10.4 Tiger) /Library has one.

$ ls -dl /Library
drwxrwxr-t   41 root  admin  1394 Jun 27 05:14 /Library
Copyright © Rixstep. All rights reserved.