Formally Specifying POSIX File Systems
Gian Ntzik, Pedro da Rocha Pinto and Philippa Gardner
Imperial College London
July 16, 2015
POSIX File Systems
◮ POSIX: Portable Operating System Interface
◮ Large part devoted to the file system
◮ English specification
◮ Underspecified
◮ Ambiguous
◮ Absence of a proper memory model
Objectives
◮ Find a formalism suitable for specifying POSIX file system operations
◮ Suitable for clients and implementations
◮ Client reasoning in a program logic
◮ Focus on a core fragment:
◮ Structural operations: mkdir, rmdir, link, unlink, rename, ...
◮ IO: open, read, write, lseek, close, ...
File System Example
[Figure: example file system graph with inodes 0 to 9, directory entries tmp, usr, share, local, .X0-lock, bin, lib and git, and file contents shown as bit strings.]
First Challenge: Atomicity
Example: unlink
◮ unlink(p): atomically remove the file identified by path p.
◮ “Atomic” has an unusual meaning in POSIX.
unlink(/usr/bin/git)
[Figure: the example file system, with the entry git under /usr/bin being unlinked.]
◮ Only removing git is required to be atomic.
◮ Path resolution: a sequence of atomic reads.
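The two bullets above can be illustrated with a small executable toy model. This is only a sketch under stated assumptions: `ToyFS`, `lookup` and `remove` are hypothetical names, the directory table is a hand-made stand-in for part of the example file system, and a single global lock is used merely to make each individual step atomic. It is not the formal model of the talk.

```python
import threading

class ToyFS:
    """Hypothetical toy file system: directory inode -> {name: inode}."""

    def __init__(self):
        # Hand-made fragment of the example tree (an assumption).
        self.dirs = {0: {"usr": 2}, 2: {"bin": 4}, 4: {"git": 9}}
        self.step = threading.Lock()  # makes each single step atomic

    def lookup(self, inode, name):
        """One atomic read: a single directory lookup."""
        with self.step:
            return self.dirs.get(inode, {}).get(name)

    def remove(self, inode, name):
        """One atomic update: drop the entry if it is present."""
        with self.step:
            entries = self.dirs.get(inode, {})
            if name in entries:
                del entries[name]
                return 0
            return -1

def unlink(fs, path):
    """Resolve the parent directory step by step, then remove the last name."""
    *parents, last = [c for c in path.split("/") if c]
    inode = 0  # root inode
    for name in parents:
        # The lock is released between lookups, so other threads
        # (the environment) may change the file system in between.
        inode = fs.lookup(inode, name)
        if inode is None:
            return -1
    return fs.remove(inode, last)

fs = ToyFS()
first = unlink(fs, "/usr/bin/git")   # entry exists and is removed
second = unlink(fs, "/usr/bin/git")  # entry is already gone
```

The whole call is not atomic: only each `lookup` and the final `remove` are, which mirrors the "sequence of atomic reads followed by one atomic update" reading.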
Sequence of atomic actions
[Figure: timeline of unlink(/usr/bin/git) over the steps usr, bin, git, with file system states FS1, FS1, FS2, FS2, FS3, FS4.]
◮ Environment: multiple atomic updates
◮ Thread: single atomic read in path traversal
◮ Thread: single atomic update
Single atomic step specification: Atomic Hoare Triple
From TaDA, IRIS, ... we know how to specify a single atomic step.
[Figure: a single step of ℂ taking P to Q.]
\[ \langle P \rangle \; C \; \langle Q \rangle \]
Multi-atomic specifications
We extend to multiple atomic steps.
[Figure: steps of ℂ taking P_1 to P_2, P_3 to P_4, ..., P_{n-1} to P_n.]
\[ C \sqsubseteq \langle (P_1, P_2); (P_3, P_4); \ldots; (P_{n-1}, P_n) \rangle \]
Multi-atomic Program Logic
◮ Introduce multi-atomics:
\[ C \sqsubseteq \langle (P_1, P_2); (P_3, P_4); \ldots; (P_{n-1}, P_n) \rangle \]
◮ Justified by an encoding in IRIS
◮ Extend reasoning rules for single atomic steps to multiple steps
Single Step Equivalence
\[ \text{SingleAtomic:} \quad C \sqsubseteq \langle (P, Q) \rangle \iff \langle P \rangle \; C \; \langle Q \rangle \]
Sequence Rule
\[ \frac{C_1 \sqsubseteq \langle \ldots; (P_1, Q_1) \rangle \qquad C_2 \sqsubseteq \langle (P_2, Q_2); \ldots \rangle}{C_1; C_2 \sqsubseteq \langle \ldots; (P_1, Q_1); (P_2, Q_2); \ldots \rangle} \; \text{Multi-Seq} \]
Stuttering Rule
\[ \frac{C \sqsubseteq \langle \ldots; (P, P); (P, Q); \ldots \rangle}{C \sqsubseteq \langle \ldots; (P, Q); \ldots \rangle} \; \text{Stutter} \]
unlink: Formal Specification
[Figure: timeline of unlink(/usr/bin/git) over the steps usr, bin, git, with file system states FS1, FS1, FS2, FS2, FS, FS'.]
◮ Path resolution of /usr/bin comes first, yielding r
◮ A single atomic step on fs(FS) follows, with one case per outcome: the entry exists, the entry is absent, or resolution returned an error
\[
\mathtt{unlink}(\mathtt{/usr/bin/git}) \sqsubseteq
\left\langle
\begin{array}{l}
\mathrm{resolve}(\mathtt{/usr/bin}, \iota_0, r); \\
\left( \mathrm{fs}(FS),\;
\begin{array}{l}
r \in \mathrm{In} \Rightarrow
\left( \begin{array}{l}
\mathrm{in}(FS, r, \mathtt{git}) \Rightarrow \mathrm{rem}(FS, r, \mathtt{git}) * \mathrm{ret} = 0 \\
\land\; \mathrm{out}(FS, r, \mathtt{git}) \Rightarrow \mathrm{fs}(FS) * \mathrm{ret} = -1 * \mathrm{errno} = \mathrm{ENOENT}
\end{array} \right) \\
\land\; r \in \mathrm{Err} \Rightarrow \mathrm{fs}(FS) * \mathrm{ret} = -1 * \mathrm{errno} = r
\end{array}
\right)
\end{array}
\right\rangle
\]
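The case analysis of the final atomic step can be read as a small pure function. The sketch below is an illustration only: the dictionary encoding of the file system, the use of integers for directory inodes (r ∈ In) and of strings for error codes (r ∈ Err), and the name `unlink_step` are all assumptions made for this example, not part of the specification.

```python
def unlink_step(fs, r, name):
    """Toy model of the last atomic step of unlink.

    fs   : dict mapping directory inode -> {entry name: inode} (assumed encoding)
    r    : result of path resolution, an int inode or a str error code
    name : last path component
    Returns (new_fs, ret, err), leaving the input fs untouched.
    """
    if isinstance(r, str):
        # r in Err: file system unchanged, errno is the resolution error
        return fs, -1, r
    entries = fs[r]
    if name in entries:
        # in(FS, r, name): the entry is removed and 0 is returned
        remaining = {k: v for k, v in entries.items() if k != name}
        return {**fs, r: remaining}, 0, None
    # out(FS, r, name): file system unchanged, errno is ENOENT
    return fs, -1, "ENOENT"

fs0 = {4: {"git": 9, "lib": 7}}
removed = unlink_step(fs0, 4, "git")
missing = unlink_step(fs0, 4, "svn")
failed = unlink_step(fs0, "ENOTDIR", "git")
```

Each branch corresponds to one implication in the postcondition; exactly one of them applies for any given FS and r.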
Second Challenge: Unordered actions
Specifying unordered actions
◮ Example: rename(p/a, p′/b)
◮ POSIX does not specify in which order p and p′ are resolved
◮ We can’t do:
\[ \mathtt{rename}(p/a, p'/b) \sqsubseteq \langle \mathrm{resolve}(p, \iota_0, r_1); \mathrm{resolve}(p', \iota_0, r_2); \ldots \rangle \]
◮ Solution:
\[ \mathtt{rename}(p/a, p'/b) \sqsubseteq \langle \mathrm{resolve}(p, \iota_0, r_1) \parallel \mathrm{resolve}(p', \iota_0, r_2); \ldots \rangle \]
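To see what the parallel form permits that the sequential form does not, one can enumerate the possible orders of the individual lookup steps. The sketch below is a hypothetical illustration: it assumes each resolution consists of two atomic lookups, labelled with made-up strings, and lists every merge that preserves the internal order of each resolution.

```python
def interleavings(xs, ys):
    """Yield every merge of xs and ys that keeps each list's own order."""
    if not xs or not ys:
        yield list(xs) + list(ys)
        return
    # Either the next step comes from xs ...
    for rest in interleavings(xs[1:], ys):
        yield [xs[0]] + rest
    # ... or it comes from ys.
    for rest in interleavings(xs, ys[1:]):
        yield [ys[0]] + rest

# Hypothetical step labels: two lookups for p, two lookups for p'.
left = ["p:1", "p:2"]
right = ["p':1", "p':2"]
orders = list(interleavings(left, right))
```

The sequential specification would admit only the single order `left + right`, whereas the parallel one admits all of the merges in `orders`, including the one that resolves p′ entirely before p.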
Parallel Rule
\[ \frac{C_1 \sqsubseteq \langle (P_1, P_2); \ldots; (P_{n-1}, P_n) \rangle \qquad C_2 \sqsubseteq \langle (Q_1, Q_2); \ldots; (Q_{n-1}, Q_n) \rangle}{C_1 \parallel C_2 \sqsubseteq \langle (P_1, P_2); \ldots; (P_{n-1}, P_n) \parallel (Q_1, Q_2); \ldots; (Q_{n-1}, Q_n) \rangle} \; \text{Multi-Par} \]
Client Applications
◮ Lock files
◮ POSIX pipes
◮ POSIX advisory locks / record locking (ongoing)
◮ Persistent concurrent queues (future)
Lock Files
◮ lock(path): atomically create a non-existing lock file at path
◮ unlock(path): remove the lock file identified by path
◮ Implemented similarly to spin locks
◮ open(path, O_CREAT | O_EXCL) to try to lock
◮ unlink to unlock
Lock File Implementation
function lock(path) {
  do {
    fd := open(path, O_EXCL | O_CREAT);
  } while (fd = −1);
  close(fd);
}
function unlock(path) {
  unlink(path);
}
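The pseudo-code above can be tried out with Python's `os` module, which exposes the same POSIX calls. The following is a minimal sketch, not the authors' implementation: `try_lock`, the retry delay, the `O_WRONLY` access mode, the file mode `0o600` and the temporary demo path are assumptions added for this example.

```python
import os
import tempfile
import time

def try_lock(path):
    """One attempt: O_CREAT | O_EXCL fails if the file already exists."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return False
    os.close(fd)
    return True

def lock(path, delay=0.01):
    """Spin until the lock file could be created (delay is an assumption)."""
    while not try_lock(path):
        time.sleep(delay)

def unlock(path):
    """Release the lock by removing the lock file."""
    os.unlink(path)

# Small demonstration in a fresh temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.lock")
lock(demo_path)
held = os.path.exists(demo_path)
second_attempt = try_lock(demo_path)  # fails while the lock is held
unlock(demo_path)
released = not os.path.exists(demo_path)
os.rmdir(demo_dir)
```

Like the pseudo-code, this treats every failure to create the file as "locked by someone else"; here only `FileExistsError` is retried, so other errors (for example a missing parent directory) propagate to the caller.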
Heap Based Lock Specification
◮ We know how to specify locks on the heap
\[ \{ \mathrm{emp} \} \; \mathtt{makelock}() \; \{ \mathrm{Lock}(\mathrm{ret}, 0) \} \]
\[ \langle \mathrm{Lock}(x, v) \rangle \; \mathtt{lock}(x) \; \langle \mathrm{Lock}(x, 1) * v = 0 \rangle \]
\[ \langle \mathrm{Lock}(x, 1) \rangle \; \mathtt{unlock}(x) \; \langle \mathrm{Lock}(x, 0) \rangle \]
◮ Ideally, we want the same specification for lock files
◮ Using path instead of heap address x
Third Challenge: Ownership
The Heap case
[Figure: makeLock() extends the allocated heap with a new cell at address ret holding 0; the cell is then enclosed in a region labelled HL.]
◮ The module owns the newly allocated memory
◮ The module enforces the sharing protocol
The File System case
[Figure: the example file system, first on its own and then with a region labelled LF.]
◮ No operation to extend with a fresh path, unknown to the environment
◮ Global path address space
◮ Clients must agree on the sharing protocol
◮ Cooperative ownership
Lock file specification
[Figure: the example file system, with a region labelled LF.]
If the client's protocols contain the lock file protocol, possibly together with other protocols, then we can use the lock file specification:
\[ \mathtt{lock}(path) \sqsubseteq \langle \mathrm{Lock}(path, v),\; \mathrm{Lock}(path, 1) * v = 0 \rangle \]
\[ \mathtt{unlock}(path) \sqsubseteq \langle \mathrm{Lock}(path, 1),\; \mathrm{Lock}(path, 0) \rangle \]
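Lock File specification
\[
\forall path, P.\; \mathrm{islock}(path, P) \Rightarrow
\begin{array}[t]{l}
\mathtt{lock}(path) \sqsubseteq \langle \mathrm{Lock}(path, v),\; \mathrm{Lock}(path, 1) * v = 0 \rangle \\
\mathtt{unlock}(path) \sqsubseteq \langle \mathrm{Lock}(path, 1),\; \mathrm{Lock}(path, 0) \rangle
\end{array}
\]
\[ \frac{\exists R.\; P \Lleftarrow\!\Rrightarrow \mathrm{LF}(path) * R}{\mathrm{islock}(path, P)} \]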
Conclusions
◮ Introduced multi-atomic specifications
◮ Ordered sequences of atomic actions
◮ Unordered parallel atomic actions
◮ Formalised a fragment of POSIX file system operations
◮ Client reasoning
◮ Ownership in file systems is cooperative
Future Work
◮ Link with operational models & testing
◮ Verify implementations
◮ Explore connection with refinement & Hoare’s algebraic laws
◮ Mechanisation