Keep track of what's going on.
LONDON (Rixstep) — This started yesterday as a piece tentatively titled 'GDE Screenshots December 2017', but, based on further research, must of needs be given a new moniker.
What began as a desultory hike through the obfuscated mysteries of the 'dot directories' under root on macOS 10.13 turned out to be anything but innocent.
/.DocumentRevisions-V100, found under root since OS X 10.7 Lion, and meant to be a user enhancement, is potentially a snake pit. Complaints and alerts about it go back to 10.7. Most users are worried about storage issues, but there is an additional overriding issue most often not mentioned: user privacy and security.
In any number of obvious scenarios, Mac users may want to 'purge' their systems: to guard against 'prying eyes' or 'hostile eyes', or because the Mac is to be sold to another user. As this hive is the domain of Time Machine, Time Machine offers users the opportunity to purge what's there, but how many take the necessary steps when needed?
But this piece is actually about APFS, macOS 10.13 'High Sierra', the ACP application GDE, and the Unix APIs getdirentries and readdir. It's deemed interesting because it's always good (beneficial) to see what's lurking in one's own file system, and because Apple announced that yet another API shall meet the chopping block: getdirentries.
getdirentries feeds readdir, which, as we all know, is the programmatic way to read a Unix directory (on a 'higher' level).
readdir filters out some of the 'bad stuff' that getdirentries tosses its way. getdirentries simply reveals all.
What's the matter with 'all'? Nothing. But some of what getdirentries reveals and readdir omits isn't needed. Normally, that is. Like zero inodes.
GDE is the Rixstep dev tool for getting into the innards of your file system through getdirentries. It's how we discovered where and how Apple had been hiding 'hard links'. Apple kept running and hiding, but GDE kept up.
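The difference between the two levels can be modelled in a few lines. This is a toy sketch, not Apple's libc: `RawEntry` and `readdir_view` are hypothetical names, the inode numbers are made up, and the only rule modelled is the one described above, namely that the higher-level reader drops records whose inode is zero.

```python
from typing import NamedTuple


class RawEntry(NamedTuple):
    d_ino: int   # inode number; 0 marks a record the higher-level reader drops
    d_name: str


def readdir_view(raw_entries):
    """Return only the entries a readdir-style reader would pass on."""
    return [e for e in raw_entries if e.d_ino != 0]


# Hypothetical raw listing, standing in for what a getdirentries-level read returns.
raw = [
    RawEntry(2, "."),
    RawEntry(2, ".."),
    RawEntry(0, ".journal"),
    RawEntry(1234, "Applications"),
    RawEntry(0, ".journal_info_block"),
]

visible = readdir_view(raw)
hidden = len(raw) - len(visible)   # what the filter took away, by simple subtraction

print([e.d_name for e in visible])  # ['.', '..', 'Applications']
print(hidden)                       # 2
```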
Here's GDE looking at boot system root on 10.13 'High Sierra' (with APFS) in user mode.
Note: directory contents are not sorted on APFS. But GDE has a keyboard shortcut to take care of that.
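The same applies to any tool reading directories on APFS: don't count on the order the file system hands back, sort it yourself. A minimal sketch, assuming a case-insensitive alphabetical order is what's wanted (how GDE itself sorts is not stated here, and `sorted_listing` is a hypothetical helper):

```python
import os


def sorted_listing(path="."):
    """List a directory's names in a stable order, whatever order the file system returns."""
    # Fold case first, then fall back to the raw name so ties sort deterministically.
    return sorted(os.listdir(path), key=lambda name: (name.casefold(), name))
```

Calling `sorted_listing("/")` gives the root listing in a predictable order, minus anything the higher-level APIs don't report in the first place.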
There are several very interesting things here, apart from the app and the listings themselves.
The first? The 'no-go zones'. They are:
.HFS+ Private Directory Data
So let's start from the top - with perhaps the most interesting and most crucial of them all.
.DocumentRevisions-V100
Yep, this guy is normally chock-full of all the kinds of files you don't want but Apple think you do. This is a repository used by Time Machine. If you're not using Time Machine, you might find this hive relatively empty. But don't count on it. More and more Apple apps are partaking of the technology. There are a number of assorted risks here.
More in a bit.
The engine what runs Spotlight. Perhaps you've already turned it off?
.HFS+ Private Directory Data
The actual name of this critter is '.HFS+ Private Directory Data\15'. The '\15' tacked on the end is octal for 'carriage return' (CR). The '\15' can show up in GDE, but not so many other places.
The '\15' is put there to make it as difficult as possible for you to get in and look around (yes, on your own computer). The 'CR' hasn't been used for 'carriage return / line feed' since Unix™ came to town. Essentially it should be very difficult for you to generate that character on your own.
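Difficult from the keyboard, that is. In code the character is one escape sequence away. A small sketch showing that the octal escape '\15' and the carriage return are the same byte, and how the full name would be spelled out (the variable names are only for illustration):

```python
CR = "\15"  # octal escape: 0o15 == 13 == carriage return
name = ".HFS+ Private Directory Data" + CR

print(CR == "\r")          # True
print(repr(name))          # '.HFS+ Private Directory Data\r'
print(oct(ord(name[-1])))  # 0o15
```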
[Yes, Apple did this a while back on a similar directory in a similar situation - to hide hard links. Does Tim know this? Ed.]
.HFS+ Private Directory Data is used today for Time Machine.
.PKInstallSandboxManager
This is where stuff gets downloaded that's going to be installed. Yes, it normally happens in the background without you knowing. But it usually won't follow through with the install (even though it can - oops) without your permission.
[A recent exception was CVE-2017-13872. They weren't taking any chances. Ed.]
.PKInstallSandboxManager-SystemSoftware
Seems the same as .PKInstallSandboxManager but for - wait for it - 'system software'? Ah. But it seems to have given many people headaches over the years, albeit occasionally in other locations.
.Spotlight-V100
For Spotlight version 1.00. (What's in a name?) If you have Spotlight turned off, you'll still find files here, but that's OK.
Through Tim's Glasses Darkly
Now it's time to look at what's inside those critters. Being as they're marked 'no-go', a special sort of legerdemain is needed.
Suddenly things aren't as 'no-go', inaccessible, anymore.
.DocumentRevisions-V100 is considerable. And with very considerable nomenclature.
It's under 'PerUID' and more specifically '501' that you're going to find the mostest.
Perhaps 40 MB or more? (This critter just grows and grows.)
[Yes, they all turn up 'blockless'. Check their flags: they're compressed. And check their 'optimal block size'. Yet another public NSWorkspace technology that Apple keep to their Smeagol selves. Ed.]
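Checking those three things for yourself takes only a stat call. A hedged sketch: `describe` is a hypothetical helper, `st_flags` exists only on BSD-derived systems such as macOS (hence the fallback to 0), and `UF_COMPRESSED` is the flag constant from Python's `stat` module.

```python
import os
import stat


def describe(path):
    """Report size, allocated blocks, optimal block size and the compressed flag."""
    st = os.lstat(path)
    flags = getattr(st, "st_flags", 0)  # BSD/macOS only; treated as 0 elsewhere
    return {
        "size": st.st_size,                            # logical size in bytes
        "blocks": getattr(st, "st_blocks", None),      # allocated 512-byte blocks
        "blksize": getattr(st, "st_blksize", None),    # 'optimal block size' for I/O
        "compressed": bool(flags & stat.UF_COMPRESSED),
    }


print(describe(os.curdir))
```

A file that shows a non-zero size, zero blocks, and `compressed` set would match the 'blockless' description above.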
But a word of caution: .DocumentRevisions-V100 can contain some pretty nasty stuff. If you're at all concerned with, or in need of, 'infosec' (security) then you'd best find a way to purge that hive - lots of things you thought were long gone could be lurking there.
This can normally be done through Time Machine, even if some users have experienced difficulties. What's important is that you remember that this hive can have fragments of files long forgotten.
ChunkStorage on this pristine APFS box had some 40 MB buried way down deep.
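To put a number on a hive like that, walk it and add up the file sizes. A minimal sketch, assuming logical size (`st_size`) is the figure of interest, since compressed files can report zero blocks; `tree_size` is a hypothetical helper, and directories the current user can't read are silently skipped.

```python
import os
import stat


def tree_size(root):
    """Sum the logical sizes (st_size) of all regular files under root."""
    total = 0
    # onerror swallows unreadable directories instead of aborting the walk.
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # vanished or unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total
```

Pointed at /.DocumentRevisions-V100 it will most likely need elevated privileges to see anything at all; without them the result is simply 0.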
db-V1 ('database version 1'?) has a meg with two SQLite files. Complex!
purgatory and staging are empty. Thank goodness.
.Spotlight-V100 has only two files (VolumeConfiguration.plist and Store-V1/VolumeConfig.plist) if you've turned it off.
.HFS+ Private Directory Data\15 should be empty if you're not using Time Machine, and .PKInstallSandboxManager and .PKInstallSandboxManager-SystemSoftware should be empty if you're keeping your box up to date. (You are doing that, aren't you?)
And now the second of a number of very interesting things. This was in the first two screenshots on this page.
Here's the first one again.
The zero inodes are gone. From 26 December 2010:
The Zero Inode
But Apple take things a step further. They have elements of their file system they don't want anyone to see - stuff on your computer they're making it as difficult as possible for you to know about and find.
One of the techniques they use is the zero inode.
'Inodes' are indexes into a volume control block - a contiguous otherwise unmapped section of your hard drive (one per volume) that contains all the data on the files on that volume: where the file is located (actual 'extents'); permissions, special flags, ownership, etc.
The inodes start at index 1 - this because index 0 (the zero inode) has a special meaning. The zero inode is a signal to a file system's 'cleanup' routines: it signals a file that's to be removed from the file system.
Files aren't immediately cleaned away when completely unlinked: the file system incorporates lazy write to save on wear and tear on the hard drive. The file system waits until the drive controller is in the vicinity of a pending write operation.
Apple 'use' ('abuse') this facility to keep even more files hidden from view. They know - their code knows - these files aren't to be cleared away: they're just hidden, really hidden. Ordinary file system APIs won't pick them up.
But as with everything else, there's ways around it all.
The 'd_ino' field at the far left is the inode field. The items '.HFS+ Private Directory Data' (used by Time Machine); '.journal' and '.journal_info_block' (used by the journaling system); and '␀␀␀␀HFS+ Private Data' (used for banished multi-linked files) all have a zero inode. They're not taken up by higher level file system APIs. Only lower level ones.
Two of the items are considered 'cloaked' - they're not mentioned at all at higher levels. (The number of cloaked items is arrived at by a simple subtraction.)
[Unix directories aren't accessible with the standard file-reading APIs anymore. Dedicated APIs are needed instead. And they're a bit more complex as well. The main reason for this is Unix file names are no longer limited to 14 bytes (and haven't been for a long long time). The five fields in the above screenshot are for the file's inode, the length of the record, the item type such as 'regular', directory, or symbolic link, the length of the file name, and finally the file name itself. It could be argued that one of the two fields 'd_reclen' and 'd_namelen' is redundant.]
There are four (4) 'zero inodes' in the above graphic. (Check the leftmost 'd_ino' column.)
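The record layout described in the bracketed note can be sketched with a synthetic buffer. Everything here is an assumption for illustration: the classic BSD layout with a 32-bit inode (4-byte inode, 2-byte record length, 1-byte type, 1-byte name length, then the NUL-terminated name padded to a 4-byte boundary), little-endian byte order, and the hypothetical helpers `pack_entry` and `parse_entries`. The system header spells the name-length field `d_namlen`.

```python
import struct

# d_ino, d_reclen, d_type, d_namlen (assumed layout, little-endian, 8 bytes)
HEADER = struct.Struct("<IHBB")
DT_DIR, DT_REG = 4, 8  # usual BSD type codes for directory and regular file


def pack_entry(ino, dtype, name):
    """Build one variable-length record for the synthetic buffer."""
    raw = name.encode("utf-8")
    reclen = (HEADER.size + len(raw) + 1 + 3) & ~3  # NUL terminator, padded to 4 bytes
    return HEADER.pack(ino, reclen, dtype, len(raw)) + raw.ljust(reclen - HEADER.size, b"\0")


def parse_entries(buf):
    """Walk the buffer record by record, stepping by d_reclen each time."""
    entries, offset = [], 0
    while offset + HEADER.size <= len(buf):
        ino, reclen, dtype, namlen = HEADER.unpack_from(buf, offset)
        if reclen == 0:
            break  # malformed record; stop rather than loop forever
        start = offset + HEADER.size
        name = buf[start:start + namlen].decode("utf-8")
        entries.append((ino, reclen, dtype, namlen, name))
        offset += reclen
    return entries


buf = pack_entry(0, DT_REG, ".journal") + pack_entry(42, DT_DIR, "etc")
for entry in parse_entries(buf):
    print(entry)
# (0, 20, 8, 8, '.journal')
# (42, 12, 4, 3, 'etc')
```

In this sketch the record length follows directly from the name length and the padding rule, which is the redundancy argument in a nutshell. A real file system is free to hand back records longer than the minimum, which is one reason to carry both fields.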
Now compare with the screenshots from the current version of the OS.
No zero inodes. And nothing cloaked either.
Times are changing.
NSWorkspace is supposed to have technologies such as compressing, decompressing, encrypting, and decrypting files.
Those technologies were meant to be inherited from NeXTSTEP/OPENSTEP code. They never made it. Instead, Apple, under growing pressure from the greybeards, caved in to the unthinkable and inexcusable: they ported the NeXT technologies to their blasted Loser Finder where they do not belong.
The ramifications of this architectural blunder and betrayal are bewildering both in number and in scope.
To many users, .DocumentRevisions-V100 is both a panicky encumbrance and a tangible security/privacy risk.
This discussion began many years ago, with 10.7 Lion:
Many at the forum didn't know what .DocumentRevisions-V100 was.
Then Alan Goodall (not to be confused with Jane, he says) had this to contribute:
A comment in the same thread:
That's a huge amount of space.
Unfortunately, Apple doesn't provide a way to examine it, to see which files have which old versions.
You can delete all old versions of individual files, by clicking just to the right of the file name in the title bar, to bring up the Versions Browser, then hold the Alt/Option key and click the name in the title bar on the right and selecting Delete All Versions.
One can, of course, use tools more powerful than Loser Finder, but that's another topic.
More pressing than file system waste, however, is the tangible privacy/security risk.
A word from Phil Stokes.
Over the last few years, Apple have made great strides in protecting users from losing their data, be it from system failure, software crashes, accidental deletion, disk corruption or just the plain negligence of forgetting to save before quitting. We now have Time Machine for automatic backups, application savedStates and Resume for crashes, and Autosave and Versions for negligence. As if all that wasn't enough, iCloud is probably syncing your browser tabs, photos, and pretty much anything else you want straight up to Apple's servers and pushing it back down the pipe to your other devices as and when needed. All this is a good thing, right?
Well, probably. For most people, most of the time. But not always. The security implications of having your OS (and even Apple) copying everything you type, open or edit on your computer can sometimes be disturbing. What if you need to open a confidential pdf in Preview but are required to make sure (either morally or contractually) that all copies of that document are destroyed after viewing? No one wants to be zeroing their hard-drive every week; and what if you need to edit a Pages or Numbers document but don't want the changes pushed to the cloud? Turning iCloud on and off is no 2-second job...
The only thing to add is that iCloud is not safe, period. Apple might be the latecomer to the PRISM party, but they're still a member of that egregious fellowship where one ISP CEO got jailed for failing to comply, and where their very first member, Microsoft, bought Skype so they could leverage data to the NSA underneath the encryption layer.
But what's most important, as regards .DocumentRevisions-V100, is that you keep track of what's going on there, and purge that hive regularly, by hook or by crook.
Take a look inside. You'll be amazed at what you find.
Wikipedia: PRISM (surveillance program)
applehelpwriter: Keeping OS X's nose out of your data