---
type: theoretical
backlinks:
  - "[[Memory Management]]"
---
A file system consists of two parts:
- Collection of files
- A directory structure -> provides information about all files in the system

## File
- Logical view -> the unit of storing data

Files are mapped by the OS onto physical nonvolatile devices.

**Types:**
- Data
    - Numeric
    - Character
    - Binary
- Program

**Attributes**:
- Name
- Identifier (unique number)
- Type[^2]
- Location -> pointer
- Size
- Protection (permissions)
- Datetime and user id

On Unix-like systems, all of these except the name are stored in **i-nodes**; the name lives in the directory entry.

### INodes
- Size in bytes
- Access permissions
- Type
- Creation and last access datetime
- Owner ID
- Group ID
- Hard link count

### Logical Definition
- Named collection of related information
- Files may have free form (text files) or can be rigidly formatted[^1]

### Operations
- Create
- Write
- Read
- Seek (reposition within file)
- Delete
- Truncate -> shorten by removing data from the end
- Open (load metadata to memory)
- Close (unload)

### Open files
Tracked in an **open-file table**. A **file-open count** records how many processes have the file open, so the entry can be removed when the last one closes it.

In order to [avoid race conditions](Inter-Process%20Communication.md#Avoiding%20race%20conditions), we need to lock the files somehow.
- **Shared lock** -> several processes can acquire it concurrently, used for reads
- **Exclusive lock** -> writer lock, only one process can hold it
- **Mandatory** vs. **advisory** -> the OS denies access depending on locks held and requested vs. processes can query the status of locks and decide what to do

### Structure
Could be many:
- None
- Simple record
    - Lines
    - Fixed length
    - Variable length
- Complex
    - Formatted document
    - Relocatable load file[^3]

## Directories
Collection of nodes containing information about all files. Also resides on disk.

**Operations**:
- Search for a file
- Create a file
- Delete a file
- List a directory
- Rename a file
- Traverse file system

### Single level directory
A single directory for all users. Clearly, we need unique names, which can become a problem real fast.
The single directory is also going to grow very big.

### Two-level directory
Users have different directories. In Linux -> `/home/user` is separate for each user, allowing for the same file names. Linux, however, uses a multi-level tree:

### Tree-Structured Directories
- Efficient searching
- Grouping
- Absolute vs. relative path

### Acyclic-Graph
Have shared subdirectories and files. Links achieve this.

### Structure
In Linux, a directory is a table (itself a file) which stores:
- File name
- Inode number

## Symlinks
**Hard** vs **Soft**. A hard link is another directory entry pointing to the same inode: no data is copied, and the inode's hard link count goes up. A soft (symbolic) link is just a pointer: a small file that stores the path of its target.

>[!IMPORTANT]
>We only allow links to files (not directories) to avoid cycles

The alternative is to run a cycle detection algorithm every time a new link is added, to determine whether it is OK.

## Disk
Can be subdivided into **partitions**. A disk/partition can be used **raw** (no file system) or can be **formatted**. The entity containing the file system is known as a volume.

> [!NOTE]- Typical fs organization
> ![](Pasted%20image%2020250505144352.png)

### Layout
![](Pasted%20image%2020250505155546.png)

- **Boot block**
    - Contains initial bootstrap program to load the OS
    - Typically the first sector reads another program from the next few sectors
- **Super block** -> state of the file system
    - Type -> ext3, ext4, FAT, etc.
    - Size -> number of blocks
    - Block size
    - Block group information -> number of block groups in file system
    - Free block count
    - Free inode count
    - Inode size
    - FS mount info
    - Journal info

### Free space management
Unix uses a bitmap to show free disk blocks. Zero=free, one=in use.

## Access lists and groups
Read, write and execute. Three classes of users on Linux (the example below is mode `761`):
1. Owner -> 7 (Read Write Execute)
2. Group -> 6 (RW)
3. Public -> 1 (X)

## Blocks
The IDs of data blocks are stored in [INodes](File%20Systems%20Management.md#INodes); the IDs of the first 12 blocks are stored in direct reference fields.
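
A rough sketch of how a file's n-th block could be mapped to an inode reference field. Only the 12 direct fields come from the notes above; the 4 KiB block size, the 4-byte block IDs, and the single/double/triple indirect levels (ext2-style) are assumptions made for illustration, and `locate_block` and `block_for_offset` are hypothetical helper names.

```python
# Assumptions (not from the notes): 4 KiB blocks, 4-byte block IDs, and
# ext2-style single/double/triple indirect pointers after the direct ones.
BLOCK_SIZE = 4096
POINTER_SIZE = 4
DIRECT_POINTERS = 12  # from the notes: first 12 block IDs are direct
POINTERS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE


def locate_block(logical_index: int) -> tuple[str, int]:
    """Return (pointer kind, index inside that kind's range) for the n-th block of a file."""
    if logical_index < 0:
        raise ValueError("logical block index must be non-negative")
    if logical_index < DIRECT_POINTERS:
        return ("direct", logical_index)

    remaining = logical_index - DIRECT_POINTERS
    capacity = POINTERS_PER_BLOCK  # blocks reachable through one single-indirect block
    for kind in ("single indirect", "double indirect", "triple indirect"):
        if remaining < capacity:
            return (kind, remaining)
        remaining -= capacity
        capacity *= POINTERS_PER_BLOCK  # each extra level multiplies the reach
    raise ValueError("file too large for this inode layout")


def block_for_offset(byte_offset: int) -> tuple[str, int]:
    """Same lookup, starting from a byte offset inside the file."""
    return locate_block(byte_offset // BLOCK_SIZE)
```

Under these assumptions the direct fields cover the first 12 × 4 KiB = 48 KiB of a file, so small files never need an indirect lookup.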
![](Pasted%20image%2020250505154746.png)

### Allocation
- Contiguous -> each file is stored in a run of adjacent blocks
- Linked Allocation -> blocks contain a pointer to the next one (slower access)
- Indexed -> each file has an index block that stores pointers to all its data blocks

### Groups
Subdivision of the entire disk or partition.

Has:
- A block bitmap
- An inode bitmap
- An inode table holding the actual inodes

> [!INFO]
> Default block group size in ext4 is 128MB (with 4KB blocks)

## Journaling
Ensures the integrity of the file system by keeping track of changes before they are actually applied to the main file system.

Phases:
- Write-ahead logging -> the intended changes are written to the journal before any changes are made to the file system
- Commit -> the logged changes are marked complete and actually applied to the file system
- Crash recovery -> we can replay the journal to re-apply committed changes that never reached the main file system; incomplete transactions are discarded

Types:
- Write-Ahead Logging (WAL) -> logs changes before they are applied to the file system
- Metadata journaling -> only metadata is logged. Metadata is restored to a consistent state after a crash.
- Full journaling -> both metadata and file data are logged

## Example: EXT4
- Journaling
- Larger file and volume sizes
- Extents -> range of contiguous blocks, reduces fragmentation
- Multiblock allocator -> allocates multiple blocks at once
- `fsck`, optimized file system check
- Pre-allocation
- Checksums -> ensure integrity

## Example: Windows FS
### FAT(32)
File allocation table. No hard links :C.

Directory contains:
- File name -> can be up to 8 characters and extension up to 3
- Attributes (one byte)
    ![](Pasted%20image%2020250505160518.png)
- File size -> four-byte field for the file size in bytes. Max. 4GB
- ID of first block (4 bytes)

With one table entry per block this does not scale to disks of very large capacity, so Windows introduced clustering: 4, 8 or 16 blocks are grouped together. The table itself has one entry per cluster, and each entry links to the next cluster of the same file, so every file is a chain through the table. Each entry is 4 bytes. The list of empty blocks is also stored.

![](Pasted%20image%2020250505161031.png)

Note the reserved blocks.
They contain:
- Boot sector (VBR)
    - BIOS parameter block
        - Sector size, cluster size, FAT count, root directory location
    - Bootloader code
- FS information sector (only for FAT32)
    - Last allocated cluster for speed
- Backup boot sector
    - In case of corruption

#### Free blocks list
Stores a value for each cluster which can indicate (FAT32 uses only the low 28 bits of each entry):
- `0x00000000` -> Free cluster
- Next cluster number -> Cluster is allocated and points to the next one
- `0x0FFFFFF8` - `0x0FFFFFFF` -> EOF
- `0x0FFFFFF7` -> bad cluster

To find a free block we just need to search for the first available cluster. We keep the last allocated cluster, optimizing search time.

### NTFS
New Technology File System.
- Everything is a cluster
    - Size is a multiple of disk block size
- Journaling
- File data compression

![](Pasted%20image%2020250505161542.png)

- Boot sector (VBR)
    - NTFS signature and other boot info
    - Location of Master File Table (MFT)
    - Sector 0 of partition
- MFT
    - Stores metadata for every file and directory
    - Each MFT entry stores attributes
        - name
        - size
        - timestamps
        - security
    - The MFT itself is described by an entry in the MFT
- File system metadata
    - `$MFT`, `$Bitmap`, `$LogFile`, `$Secure`, etc. store metadata
    - System files are treated like regular files
- Data
    - Actual file content, either stored in the MFT (small files) or in separate clusters (large files)
    - Uses extents[^4] and B+ trees[^5]
    - Supports encryption

#### MFT entry
Each file or directory is represented by a 1KB entry:
- File name
- Info (timestamps, perms)
- Data location (resident[^6] or not)
- Index
- Attributes

![](Pasted%20image%2020250505162331.png)

##### `$DATA` - Mft Entry
- If the file contains regular data, the `$DATA` attribute stores the file content or its location
- For files that fit in a single MFT record (1KB usually)
    - In-place storage of data (resident)
- For larger files, the `$DATA` attribute contains data runs, which are pointers that tell NTFS where the file's data is located on the disk.
Typically a sequence of three values:
    - Header byte -> sizes of the length and offset fields that follow
    - Cluster count
    - Cluster offset

##### Bitmaps
- Map of logical clusters in use and not. Same as FAT.

##### Compression
Compresses data in 16-cluster chunks. Size of a compression unit (chunk) depends on cluster size:
- 4 KB cluster size -> 64 KB compression unit (most common on modern volumes)
- 8 KB cluster size -> 128 KB compression unit

If a chunk is not compressible to at least 50%, NTFS stores it uncompressed.

Uses LZNT1, a variation of LZ77.

##### Journaling
Logs file system (metadata) changes in `$LogFile` before applying them.
- It can detect bad sectors and mark them in `$BadClus`
- NTFS can recover a corrupted MFT using `$MFTMirr`
- NTFS uses ACLs to manage permissions
- Each file stores a `$SECURITY_DESCRIPTOR`

### Security descriptors
```
Owner: S-1-5-21-3623811015-3361044348-30300820-1001 (User: Alice)
Group: S-1-5-32-544 (Administrators)
DACL:
  Allow: S-1-5-21-3623811015-3361044348-30300820-1001 (Alice) - Full Control
  Deny: S-1-5-21-3623811015-3361044348-30300820-1002 (Bob) - Read Access
  Allow: S-1-5-18 (Local System) - Full Control
SACL:
  Audit: S-1-5-21-3623811015-3361044348-30300820-1003 (Eve) - Log Failed Access
```
Where DACL = **Discretionary Access Control List** and SACL = **System Access Control List**

---

[^1]: **Columnar**, fixed-format ASCII files have fixed field lengths, as opposed to **delimited** files, where fields can be as large as we want them to be.
[^2]: Extension (.pdf, .txt), as opposed to format, which specifies the [grammar](Regular%20languages.md) of the file.
[^3]: Contains information about where to place different parts of the program in memory.
[^4]: Contiguous area of storage reserved for a file in a file system, represented as a range of block numbers, or tracks on count key data devices.
[^5]: Height-balanced tree. Nodes can contain multiple keys and pointers. Leaf nodes hold the data records, upper nodes only store keys. Ordered (a search tree like a BST, but with many keys per node).
[^6]: Stored directly inside the MFT entry.