Rixstep
 About | ACP | Buy | Industry Watch | Learning Curve | News | Products | Search | Substack

'ACL bug, root cause'

Laugh? Cry? Buy another iPad? Tap tap tap?



The following showed up on Friday as a comment at the Open Radar report for Joel Bruner's bug discovery. The comment is written by a 'lambert.tr' who claims to have been involved in maintaining Apple's lower level ACL code and who obviously knows a lot about what's wrong with Arno's old team. Should one laugh or cry or just buy another iPad? Tap tap tap. H/T lambert.tr.

I don't know whether to laugh or cry... I used to maintain the ACL code in the Mac OS X kernel. This is a user-space bug in the DesktopServices framework.

Although this is not usually a problem, since only foolish/untrained administrators use Finder copies on systems being used as servers, I tried several times to get the Desktop Services folks to fix this. Mac OS X has multiple 'copy engines', and the one in libc gets this right, while the one in the DesktopServices framework gets this wrong.

The problem is that the Finder 'copy engine' code sets an ACL in the openx_np() system call, rather than using the chmodx_np() system call after the fact to set an explicit ACL. The ACL it passes to openx_np() is obtained from the source file system object via getattrlist() (but could as easily have come from statx_np()). So the ACL being set is the combination of the ACL set explicitly by openx_np() and the ACL being set as a result of the inheritance bit on the container directory in which the new file or directory is being created.
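A minimal sketch of how that combination produces duplicates, in portable C rather than against the real kernel interfaces. The `struct ace` layout, the flag names, and `create_with_acl()` are all hypothetical stand-ins for illustration; the model only assumes what the comment describes, namely that explicit entries are stored as given and the container's inheritable entries are added on top.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified ACE; the real kernel structures differ. */
enum {
    ACE_INHERITED    = 1 << 0, /* entry arrived on this object via inheritance */
    ACE_FILE_INHERIT = 1 << 1  /* container entry that new children inherit */
};

struct ace {
    char     who[32]; /* user or group name */
    int      allow;   /* 1 = allow, 0 = deny */
    unsigned rights;  /* bitmask of rights */
    unsigned flags;
};

/*
 * Model of creating an object with an explicit ACL inside a container:
 * the explicit entries are stored as given, then every inheritable entry
 * of the container is appended and marked as inherited.
 * Returns the number of entries written, or 0 if 'out' is too small.
 */
size_t create_with_acl(const struct ace *explicit_acl, size_t n_explicit,
                       const struct ace *container, size_t n_container,
                       struct ace *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL);
    for (size_t i = 0; i < n_explicit; i++) {
        if (n == cap)
            return 0;
        out[n++] = explicit_acl[i];
    }
    for (size_t i = 0; i < n_container; i++) {
        if (!(container[i].flags & ACE_FILE_INHERIT))
            continue;
        if (n == cap)
            return 0;
        out[n] = container[i];
        out[n].flags = ACE_INHERITED;
        n++;
    }
    return n;
}
```

In this model, a file created normally in a folder with one inheritable entry ends up with one ACE. Passing that file's whole ACL back in as the explicit ACL for a copy in the same folder yields two entries with the same principal and rights, and a copy of the copy would yield three.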

This is in fact necessary, since the only way to make image backups of a subtree such that the copied subtree has exactly the same permissions in the target subtree as it had in the source subtree is to set all of the ACLs that were on the source object onto the target. Anything else loses permissions grants or denials on the copy of the object which were present on the original. This is either inconvenient, in the case of grants, or a critical security bug, in the case of denials.
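To see why a lost denial is the worse case, here is a deliberately simplified access check in portable C. The `struct ace` layout, the `"everyone"` wildcard, and `check_access()` are hypothetical, and the first-matching-entry rule is an assumption made for the sketch; real ACL evaluation is more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified ACE for illustration only. */
struct ace {
    const char *who;    /* principal name; "everyone" matches anybody */
    int         allow;  /* 1 = allow, 0 = deny */
    unsigned    rights; /* bitmask of rights */
};

enum { ACCESS_DENIED = 0, ACCESS_GRANTED = 1, ACCESS_UNDECIDED = 2 };

/*
 * Walk the ACL in order and let the first entry that names the principal
 * and covers the requested right decide. If no entry applies, the ACL
 * has no opinion and the caller would fall back to other permissions.
 */
int check_access(const struct ace *acl, size_t n,
                 const char *who, unsigned want)
{
    assert(who != NULL);
    for (size_t i = 0; i < n; i++) {
        int names_us = strcmp(acl[i].who, who) == 0 ||
                       strcmp(acl[i].who, "everyone") == 0;
        if (!names_us || !(acl[i].rights & want))
            continue;
        return acl[i].allow ? ACCESS_GRANTED : ACCESS_DENIED;
    }
    return ACCESS_UNDECIDED;
}
```

With an ACL of "deny bob write" followed by "allow everyone write", bob is denied. If the copy drops the deny entry, the same check grants bob write access, which is the security bug the comment warns about. Dropping the allow entry instead merely leaves the ACL without an opinion.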

You can also see where this would be a necessary step for a backup/restore operation, where the data is serialized into an archive format on the backup, and deserialized back into the file system on a restore, which could be a partial archive restore.

Things can get even more complicated when Time Machine and Spotlight are thrown into the mix, since Spotlight adds inherited ACEs to permit it to index directory contents that would otherwise be denied it by ACL, as does Time Machine. (For some reason, they do not share a common group ID and utilise a single shared system functionality ACE, but I digress.) Likewise Time Machine sets an inherited ACE on its backup volume for similar reasons.

The correct fix is to do ACE de-duplication in the case that the target directory container has inherited ACE entries which match the ACE entries on the source object, and remove duplicates from those explicitly listed in the openx_np() call. The alternative approach is to explicitly set exactly the desired ACL on the target after the target is created - this has the drawback that you would need to explicitly know the container ACL's inherited ACE list in order to aggregate it yourself, but has the advantage that you won't be denied access to the object during creation if your openx_np() ACL contains explicit rights grants for the group or user that the creating entity runs under (this should be coupled with a subsequent 'deny everyone' ACE to avoid a security race, which makes this the less desirable workable solution).
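The de-duplication step might look roughly like the following, again in portable C with a hypothetical `struct ace`, hypothetical flag names, and a hypothetical `dedup_against_container()` helper. What counts as a "match" is an assumption here: same principal, same allow/deny kind, and same rights, ignoring the inheritance flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified ACE; the real kernel structures differ. */
enum {
    ACE_INHERITED    = 1 << 0, /* entry arrived on this object via inheritance */
    ACE_FILE_INHERIT = 1 << 1  /* container entry that new children inherit */
};

struct ace {
    char     who[32]; /* user or group name */
    int      allow;   /* 1 = allow, 0 = deny */
    unsigned rights;  /* bitmask of rights */
    unsigned flags;
};

/* Assumed matching rule: principal, kind, and rights agree; flags ignored. */
static int same_grant(const struct ace *a, const struct ace *b)
{
    return strcmp(a->who, b->who) == 0 &&
           a->allow == b->allow &&
           a->rights == b->rights;
}

/*
 * Remove from 'acl' (the list taken from the source object) every entry
 * that the target container would supply anyway through inheritance.
 * Compacts 'acl' in place, preserving order, and returns the new count.
 */
size_t dedup_against_container(struct ace *acl, size_t n,
                               const struct ace *container, size_t n_container)
{
    size_t kept = 0;

    assert(acl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int dup = 0;
        for (size_t j = 0; j < n_container && !dup; j++) {
            if ((container[j].flags & ACE_FILE_INHERIT) &&
                same_grant(&acl[i], &container[j]))
                dup = 1;
        }
        if (!dup)
            acl[kept++] = acl[i];
    }
    return kept;
}
```

The surviving entries are what would be passed as the explicit ACL at creation time. Entries on the source that the target container does not supply, including any denials, are kept, so nothing is lost on the copy.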

Note that the above should make it obvious why a depth-first post-application of ACLs on copied objects wouldn't work: apart from the security problems in the order-of-operations window, network protocols such as AFP, NFSv4, and SMB all use connection credentials rather than request credentials (NFSv3 uses request credentials), and even privileged users do not have access to other users' keychains or session passwords in effect for a given copy operation.

There's obviously a better, easier, and simpler way to do things: scupper the entire sorry mess and use the NeXTSTEP and Unix (libc) APIs that are supposed to be used, dammit. PS. Programmers still playing with Arno's 'beige' Desktop Services code anno 2011 should have their licences revoked.

See Also
Rixstep Learning Curve: .DS_Store Redux
Rixstep Learning Curve: Spatiality Redux
Rixstep Learning Curve: Bride of .DS_Store

The Technological: Desktop Services Store
The Technological: Of Assholes Gadflies Graybeards & Trolls

ACP: Test Drive Xfile!
ACP: ACL: Access Control
ACP: Xfile - The Standard Setter
Open Radar: Finder: Inherited ACL Duplication
Rixstep Learning Curve: Highway -41 Revisited
Rixstep Learning Curve: File Management Macintosh Style
Rixstep Learning Curve: Joel Bruner's '-41' Test with Xfile
Brunerd: Finder's Nasty Inherited ACL Bug (aka Error -41)
Rixstep's Red Hat Diaries: Back Burner (A Pretty Cool Place to Be)
Rixstep Developers Workshop: It Wasn't Good Then, It's No Better Now
Rixstep Industry Watch: Finder's Nasty Inherited ACL Bug (aka Error -41)

Copyright © Rixstep. All rights reserved.