---
type: theoretical
backlinks:
  - "[[Memory Management]]"
---

A file system consists of two parts:

- A collection of files
- A directory structure, which provides information about all files in the system

## File

- Logical view -> the unit of storing data

Files are mapped by the OS onto physical nonvolatile devices.

**Types:**

- Data
    - Numeric
    - Character
    - Binary
- Program

**Attributes**:

- Name
- Identifier (unique number)
- Type[^2]
- Location -> pointer
- Size
- Protection (permissions)
- Datetime and user id

On Unix-like systems, all of these except the name are stored in **i-nodes**; the name lives in the directory entry that points to the i-node.

### INodes

- Size in bytes
- Access permissions
- Type
- Creation and last access datetime
- Owner ID
- Group ID
- Hard link count
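
Most of the inode fields listed above can be inspected from user space. The following is a minimal sketch using Python's `os.stat`; the helper name `describe_inode` and the temporary file are illustrative assumptions, not part of these notes.

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Return a dict with the inode fields of `path` reported by os.stat."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                           # identifier (unique number)
        "size_bytes": st.st_size,                     # size in bytes
        "permissions": stat.filemode(st.st_mode),     # access permissions + type
        "is_regular_file": stat.S_ISREG(st.st_mode),  # type
        "owner_id": st.st_uid,
        "group_id": st.st_gid,
        "hard_links": st.st_nlink,                    # hard link count
        "last_access": st.st_atime,                   # seconds since the epoch
    }


# Hypothetical demo file, created only for this example.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as f:
        f.write("hello")
    info = describe_inode(path)
    print(info)
```

Note that the file name is passed in by the caller and does not appear among the fields returned by `os.stat`.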

### Logical Definition

- Named collection of related information
- Files may be free-form (text files) or rigidly formatted[^1]

### Operations

- Create
- Write
- Read
- Seek (reposition within file)
- Delete
- Truncate -> shorten the file by removing data from the end
- Open (load the file's metadata into memory)
- Close (unload)
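
As a rough illustration, the operations above map onto Python's file API as follows. This is a sketch on a throwaway temporary file; the file name `notes.txt` is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")

    # Create + Open: "w+b" creates the file and opens it for reading and writing.
    with open(path, "w+b") as f:
        f.write(b"hello world")   # Write
        f.seek(6)                 # Seek: reposition to byte offset 6
        tail = f.read()           # Read from offset 6 to the end
        f.truncate(5)             # Truncate: keep only the first 5 bytes
        f.seek(0)
        head = f.read()           # Read what is left
    # Close happens automatically when the `with` block ends.

    os.remove(path)               # Delete
    exists_after_delete = os.path.exists(path)
```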

### Open files

Open files are tracked in an **open-file table**; a **file-open count** records how many times each file is currently open, so its entry can be removed when the last process closes it.

In order to [avoid race conditions](Inter-Process%20Communication.md#Avoiding%20race%20conditions), we need to lock the files somehow.

- **Shared lock** -> several processes can acquire it concurrently, used for reads
- **Exclusive lock** -> writer lock, held by at most one process at a time
- **Mandatory** vs. **advisory**
    - Mandatory -> access is denied depending on locks held and requested
    - Advisory -> processes can find the status of locks and decide what to do
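
A minimal sketch of shared and exclusive advisory locks, assuming a Unix-like system where Python's `fcntl.flock` is available. Each `open()` call creates its own open file description, which stands in for a separate process here; the helper `try_lock` is a hypothetical name.

```python
import fcntl
import tempfile


def try_lock(f, mode):
    """Try to take an flock lock without blocking; return True on success."""
    try:
        fcntl.flock(f, mode | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


with tempfile.NamedTemporaryFile() as tmp:
    reader_a = open(tmp.name, "rb")
    reader_b = open(tmp.name, "rb")
    writer = open(tmp.name, "r+b")

    # Two shared (read) locks can be held at the same time.
    a_shared = try_lock(reader_a, fcntl.LOCK_SH)
    b_shared = try_lock(reader_b, fcntl.LOCK_SH)

    # An exclusive (write) lock is refused while readers hold shared locks.
    writer_while_readers = try_lock(writer, fcntl.LOCK_EX)

    # Once the readers release their locks, the writer can proceed.
    fcntl.flock(reader_a, fcntl.LOCK_UN)
    fcntl.flock(reader_b, fcntl.LOCK_UN)
    writer_after_release = try_lock(writer, fcntl.LOCK_EX)

    for f in (reader_a, reader_b, writer):
        f.close()
```

These locks are advisory: a process that never calls `flock` can still read or write the file.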

### Structure

Could be many:

- None
- Simple record
    - Lines
    - Fixed length
    - Variable length
- Complex
    - Formatted document
    - Relocatable load file [^3]

## Directories

A collection of nodes containing information about all files. Like the files themselves, it resides on disk.

**Operations**:

- Search for a file
- Create a file
- Delete a file
- List a directory
- Rename a file
- Traverse file system

### Single level directory

A single directory for all users.

Clearly, all names must be unique, which becomes a problem fast: the directory grows very large.

### Two-level directory

Each user has their own directory. In Linux, `/home/user` is separate for each user, so different users can reuse the same file names. Linux as a whole, however, uses a multi-level tree:

### Tree-Structured Directories

- Efficient searching
- Grouping
- Absolute vs. relative paths

### Acyclic-Graph

Directories can have shared subdirectories and files. Links (hard or symbolic) achieve this.

### Structure

In Linux, a directory is a table (itself a file) which stores, for each entry:

- File name
- Inode number
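
The name-to-inode table can be read back with `os.scandir`. A small sketch, where `list_directory` and the two file names are hypothetical:

```python
import os
import tempfile


def list_directory(path):
    """Return {file name: inode number} for every entry in the directory."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt"):
        open(os.path.join(tmp, name), "w").close()
    table = list_directory(tmp)
    print(table)
```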
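
## Symlinks

**Hard** vs **Soft**. A hard link is not a copy: it is another directory entry that points to the same inode, so both names refer to the same data. A soft (symbolic) link is just a pointer, a small file that stores the path of its target.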
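
The difference can be observed directly. A sketch assuming a Unix-like system (creating symlinks may need extra privileges elsewhere); all file names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.txt")
    hard = os.path.join(tmp, "hard.txt")
    soft = os.path.join(tmp, "soft.txt")

    with open(original, "w") as f:
        f.write("data")

    os.link(original, hard)      # hard link: second name for the same inode
    os.symlink(original, soft)   # soft link: separate file storing a path

    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink
    # lstat inspects the link itself instead of following it.
    soft_has_own_inode = os.lstat(soft).st_ino != os.stat(original).st_ino

    # Removing the original name leaves the data reachable via the hard link,
    # while the soft link now points at a path that no longer exists.
    os.remove(original)
    with open(hard) as f:
        hard_still_readable = f.read() == "data"
    soft_dangling = not os.path.exists(soft)
```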

>[!IMPORTANT]
>To avoid cycles, we can either allow links only to files (not to directories), or run a cycle detection algorithm every time a new link is added to determine whether it is OK.

## Disk

Can be subdivided into **partitions**.

A disk or partition can be used **raw** (no file system) or can be **formatted**. The entity containing the file system is known as a **volume**.

> [!NOTE]- Typical fs organization
> ![Pasted image 20250224095257.png](../attachments/Pasted%20image%2020250224095257.png)

### Layout

![Pasted image 20250224095417.png](../attachments/Pasted%20image%2020250224095417.png)

- **Boot block**
    - Contains initial bootstrap program to load the OS
    - Typically the first sector reads another program from the next few sectors
- **Super block** - state of the file system
    - Type -> ext3, ext4, FAT, etc.
    - Size -> number of blocks
    - Block size
    - Block group information -> number of block groups in file system
    - Free block count
    - Free inode count
    - Inode size
    - FS mount info
    - Journal info

### Free space management

Unix uses a bitmap to show free disk blocks: zero = free, one = in use.
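
A toy version of such a bitmap, as a sketch. It assumes one bit per block with the least-significant bit of each byte describing the lowest-numbered block; the function names are hypothetical.

```python
def find_first_free(bitmap):
    """Return the index of the first free block (bit = 0), or -1 if none."""
    for byte_index, byte in enumerate(bitmap):
        if byte != 0xFF:  # at least one bit in this byte is zero
            for bit in range(8):
                if not (byte >> bit) & 1:
                    return byte_index * 8 + bit
    return -1


def mark_used(bitmap, block):
    """Set the bit for `block` to 1 (in use)."""
    bitmap[block // 8] |= 1 << (block % 8)


# Blocks 0-10 are in use, blocks 11-23 are free.
bitmap = bytearray([0xFF, 0b00000111, 0x00])
first_free = find_first_free(bitmap)
mark_used(bitmap, first_free)
next_free = find_first_free(bitmap)
```

Skipping whole bytes equal to `0xFF` is what makes the bitmap cheap to scan: eight blocks are ruled out with one comparison.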

## Access lists and groups

Three access modes: read, write and execute.

Three classes of users on Linux (with example modes):

1. Owner -> 7 (Read Write Execute)
2. Group -> 6 (RW)
3. Public -> 1 (X)
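
The three digits above form the octal mode `761`. A short sketch decoding it with Python's `stat` module; `describe_mode` is a hypothetical helper.

```python
import stat


def describe_mode(mode):
    """Render permission bits the way `ls -l` does, assuming a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


example = describe_mode(0o761)

# Individual bits can be tested with the masks from the stat module.
owner_can_write = bool(0o761 & stat.S_IWUSR)
others_can_read = bool(0o761 & stat.S_IROTH)
```

Each octal digit is the sum of read (4), write (2) and execute (1), so 7 = rwx, 6 = rw- and 1 = --x.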

## Blocks

The IDs of data blocks are stored in [INodes](File%20Systems%20Management.md#INodes). The IDs of the first 12 blocks are stored in direct reference fields.

![Pasted image 20250224100748.png](../attachments/Pasted%20image%2020250224100748.png)

### Allocation

- Contiguous -> each file is stored in a single run of adjacent blocks
- Linked allocation -> blocks contain a pointer to the next one (slower access)
- Indexed -> each file has an index block that stores pointers to all its data blocks

### Groups

A subdivision of the entire disk or partition.

Each block group has:

- A block bitmap
- An inode bitmap
- An inode table holding the actual inodes

> [!INFO]
> Default block group size in ext4 is 128 MB (with 4 KB blocks)
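
The 128 MB figure follows from the block bitmap. A sketch of the arithmetic, assuming the bitmap of a group occupies exactly one block, so that one block's worth of bits decides how many blocks a group can track:

```python
def block_group_size(block_size):
    """Bytes covered by one block group, given the block size in bytes."""
    blocks_per_group = block_size * 8   # one bit per block in a one-block bitmap
    return blocks_per_group * block_size


size_4k = block_group_size(4096)        # 32768 blocks of 4096 bytes each
size_4k_mib = size_4k // (1024 * 1024)
```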

## Journaling

Ensures the integrity of the file system by keeping track of changes before they are actually applied to the main file system.

Phases:

- Write-ahead logging -> changes are recorded in the journal before any changes are made to the file system
- Commit -> the logged transaction is marked complete and the changes are applied to the main file system
- Crash recovery -> we can replay the journal to re-apply committed changes that had not yet reached the main file system; uncommitted transactions are discarded

Types:

- Write-Ahead Logging (WAL) -> logs changes before they are applied to the file system
- Metadata journaling -> only metadata is logged. Metadata is restored to a consistent state after a crash.
- Full journaling -> both data and metadata are logged
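
The phases can be mimicked with a toy in-memory journal. This is only a sketch of the idea, not how any real file system lays out its journal; the class `ToyJournal`, its methods and the example keys are all hypothetical.

```python
class ToyJournal:
    """In-memory stand-in for a journal plus the main file system."""

    def __init__(self):
        self.log = []    # journal records, in the order they were written
        self.disk = {}   # stand-in for the main file system

    def begin(self, txid, changes):
        # Write-ahead logging: record the intended changes first.
        self.log.append(("write", txid, dict(changes)))

    def commit(self, txid):
        # Commit: mark the transaction as complete in the journal.
        self.log.append(("commit", txid, None))

    def recover(self):
        # Crash recovery: replay committed transactions, drop the rest.
        committed = {txid for kind, txid, _ in self.log if kind == "commit"}
        for kind, txid, changes in self.log:
            if kind == "write" and txid in committed:
                self.disk.update(changes)
        self.log.clear()


journal = ToyJournal()
journal.begin(1, {"inode 7 size": 4096})
journal.commit(1)
journal.begin(2, {"inode 9 size": 512})   # simulated crash before commit
journal.recover()
```

After recovery only transaction 1 is visible on the toy disk; transaction 2 never committed, so it leaves no trace.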

## Example: EXT4

- Journaling
- Larger file and volume sizes
- Extents -> range of contiguous blocks, reduces fragmentation
- Multiblock allocator -> allocates multiple blocks at once
- Faster `fsck` (file system check)
- Pre-allocation
- Checksums -> ensure integrity

## Example: Windows FS

### FAT(32)

File allocation table.

No hard links :C. Directory contains:

- File name -> can be up to 8 characters and extension up to 3
- Attributes (one byte)

![Pasted image 20250224105317.png](../attachments/Pasted%20image%2020250224105317.png)

- File size -> four-byte field for the file size in bytes. Max. 4 GB
- ID of first block (4 bytes)

This scheme does not scale to disks of very large capacity, so Windows introduced clustering: 4, 8, 16, ... blocks are grouped together and allocated as one unit.

The table itself has one entry per cluster. Each entry links to the next cluster of the same file, so every file is a chain of links through the table. Each entry is 4 bytes. Free clusters are also recorded in the table.

![Pasted image 20250224110015.png](../attachments/Pasted%20image%2020250224110015.png)

Note the reserved blocks. They contain:

- Boot sector (VBR)
    - BIOS parameter block
    - Bootloader code
    - Sector, cluster size, FAT count, root directory location
- FS information sector (only for FAT32)
    - Last allocated cluster for speed
- Backup boot sector
    - In case of corruption

#### Free blocks list

The FAT stores a value for each cluster, which can indicate:

- `0x00000000` -> Free cluster
- Next cluster number -> Cluster is allocated and points to the next one
- `0x0FFFFFF8` - `0x0FFFFFFF` -> EOF (end of the cluster chain)
- `0x0FFFFFF7` -> bad cluster

FAT32 uses only the low 28 bits of each entry (the top 4 bits are reserved), which is why the special values start with `0x0`.

To find a free block we just need to search for the first free cluster. We keep the last allocated cluster and start searching from there, which shortens the search.
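
A sketch of both ideas on a tiny in-memory table: following a file's cluster chain, and searching for a free cluster starting after the last allocated one. The list `fat`, the helper names and the example values are hypothetical. Entries are masked to their low 28 bits before comparison, on the assumption that FAT32 reserves the top 4 bits.

```python
FREE = 0x00000000
BAD = 0x0FFFFFF7
EOC_MIN = 0x0FFFFFF8        # values from here up mark the end of a chain
ENTRY_MASK = 0x0FFFFFFF     # low 28 bits of a FAT32 entry


def cluster_chain(fat, first_cluster):
    """Follow the links from `first_cluster` until the end-of-chain marker."""
    chain = []
    cluster = first_cluster
    while True:
        chain.append(cluster)
        entry = fat[cluster] & ENTRY_MASK
        if entry >= EOC_MIN:
            return chain
        if entry in (FREE, BAD):
            raise ValueError(f"broken chain at cluster {cluster}")
        cluster = entry


def find_free(fat, last_allocated):
    """Search for a free cluster, starting just after `last_allocated`."""
    n = len(fat)
    for step in range(1, n + 1):
        candidate = (last_allocated + step) % n
        # Clusters 0 and 1 are reserved and never handed out.
        if candidate >= 2 and (fat[candidate] & ENTRY_MASK) == FREE:
            return candidate
    return None


# index:   0 (reserved) 1 (reserved) 2  3     4           5  6     7
fat = [0x0FFFFFF8, 0x0FFFFFFF, 5, FREE, 0x0FFFFFFF, 4, FREE, BAD]

chain = cluster_chain(fat, 2)        # file starting at cluster 2
free_after_5 = find_free(fat, 5)
free_after_6 = find_free(fat, 6)     # wraps around past the bad cluster
```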

### NTFS

New Technology File System.

- Everything is a cluster
    - Size is a multiple of disk block size
- Journaling
- File data compression

![Pasted image 20250224111354.png](../attachments/Pasted%20image%2020250224111354.png)

- Boot sector (VBR)
    - NTFS signature and other boot info
    - Location of Master File Table (MFT)
    - Sector 0 of partition
- MFT
    - Stores metadata for every file and directory
    - MFT entry that stores attributes
        - name
        - size
        - timestamps
        - security
    - The MFT itself is described by an entry in the MFT
- File system metadata
    - `$MFT`, `$Bitmap`, `$LogFile`, `$Secure`, etc. store metadata
    - System files are treated like regular files
- Data
    - Actual file content, either stored in MFT for small entries or in separate clusters (large files)
    - Uses extents[^4] and B+ trees[^5]
    - Supports encryption

#### MFT entry

Each file or directory is represented by a 1 KB entry:

- File name
- Info (timestamps, perms)
- Data location (resident[^6] or not)
- Index
- Attributes

![Pasted image 20250224112056.png](../attachments/Pasted%20image%2020250224112056.png)

##### `$DATA`

- MFT entry
    - If the file contains regular data, the `$DATA` attribute stores the file content or its location
- For files that fit in a single MFT record (1 KB usually)
    - In-place storage of data (resident)
- For larger files, the `$DATA` attribute contains data runs, which are pointers that tell NTFS where the file's data is located on the disk. Each run is typically a sequence of three values:
    - Header byte giving the sizes of the length and offset fields
    - Cluster count
    - Cluster offset
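
A sketch of decoding such a run list. The byte layout is an assumption based on the commonly documented NTFS encoding, not something stated in these notes: the low nibble of the header byte gives the size of the cluster-count field, the high nibble gives the size of the offset field, both fields are little-endian, the offset is signed and relative to the start of the previous run, and a `0x00` header ends the list. Sparse runs are not handled, and the sample bytes are made up for the example.

```python
def decode_data_runs(raw):
    """Decode a data-run list into (starting cluster, cluster count) pairs."""
    runs = []
    pos = 0
    start_cluster = 0
    while pos < len(raw) and raw[pos] != 0x00:
        header = raw[pos]
        length_size = header & 0x0F   # bytes used by the cluster count
        offset_size = header >> 4     # bytes used by the cluster offset
        pos += 1

        count = int.from_bytes(raw[pos:pos + length_size], "little")
        pos += length_size

        delta = int.from_bytes(raw[pos:pos + offset_size], "little", signed=True)
        pos += offset_size

        start_cluster += delta        # offsets are relative to the previous run
        runs.append((start_cluster, count))
    return runs


# Two runs followed by the 0x00 terminator:
#   21 18 34 56 -> 0x18 clusters at offset +0x5634
#   11 10 F0    -> 0x10 clusters at offset -0x10 from the previous run
runs = decode_data_runs(bytes.fromhex("21183456" "1110f0" "00"))
```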

##### Bitmaps

- Map of which logical clusters are in use and which are free. Serves the same purpose as the free-cluster marking in FAT.

##### Compression

Compresses data in 16-cluster chunks.

The size of a compression unit (chunk) therefore depends on the cluster size:

- 4 KB cluster size -> 64 KB compression unit (most common on modern volumes)
- 8 KB cluster size -> 128 KB compression unit

If a chunk is not compressible to at least 50%, NTFS stores it uncompressed.

Uses LZNT1, a variant of LZ77.

##### Journaling

Logs all file system changes in `$LogFile` before applying them.

- It can detect bad sectors and mark them in `$BadClus`
- NTFS can recover a corrupted MFT using `$MFTMirr`
- NTFS uses ACLs to manage permissions
    - Each file stores a `$SECURITY_DESCRIPTOR`

### Security descriptors

```
Owner: S-1-5-21-3623811015-3361044348-30300820-1001 (User: Alice)
Group: S-1-5-32-544 (Administrators)
DACL:
  Allow: S-1-5-21-3623811015-3361044348-30300820-1001 (Alice) - Full Control
  Deny: S-1-5-21-3623811015-3361044348-30300820-1002 (Bob) - Read Access
  Allow: S-1-5-18 (Local System) - Full Control
SACL:
  Audit: S-1-5-21-3623811015-3361044348-30300820-1003 (Eve) - Log Failed Access
```

Where DACL = **Discretionary Access Control List** and SACL = **System Access Control List**.

---

[^1]: **Columnar**, fixed-format ASCII files have fixed field lengths, as opposed to **delimited** files, whose fields can be as large as we want them to be.

[^2]: Extension (.pdf, .txt), as opposed to format, which specifies the [grammar](Regular%20languages.md) of the file.

[^3]: Contains information about where to place different parts of the program in memory.

[^4]: A contiguous area of storage reserved for a file in a file system, represented as a range of block numbers, or tracks on count key data devices.

[^5]: Height-balanced tree. Nodes can contain multiple keys and pointers. Leaf nodes hold the data records, upper nodes only store keys. Keys are kept in sorted order, as in a BST.

[^6]: Stored directly inside the MFT entry.