Efficient File Fragments Classification using Depthwise Separable Convolutions
Abstract
File fragment classification is a significant element of digital forensics. Researchers have used several approaches to classify file fragments without relying on metadata. Recently, deep learning models, including convolutional and feed-forward neural networks, have been applied to this task. This paper proposes a depthwise separable convolutional neural network for efficient file fragment classification. In our evaluation, the proposed model is both faster and more accurate than state-of-the-art models on 75 file fragment types. In particular, it achieves an accuracy of 78.45% on the FFT-75 dataset with about 100K parameters and 167M FLOPs, which makes it 24x faster and 4-5x smaller than the state-of-the-art classifier in the literature.
Technologies:
- Python
- PyTorch
- Keras
- NumPy
- Matplotlib
- LaTeX
Inception Depthwise Separable Conv Block
The model consists of multiple Inception depthwise separable blocks followed by a 1x1 convolution for classification.
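To illustrate why depthwise separable convolutions keep the model small, the sketch below counts parameters for a standard 1-D convolution and for its depthwise separable counterpart (a per-channel depthwise filter followed by a 1x1 pointwise convolution). The layer sizes are hypothetical and chosen only for illustration; they are not the channel widths or kernel sizes used in this repository.

```python
def conv1d_params(c_in, c_out, kernel_size, bias=True):
    """Parameter count of a standard 1-D convolution."""
    return c_in * c_out * kernel_size + (c_out if bias else 0)


def depthwise_separable_conv1d_params(c_in, c_out, kernel_size, bias=True):
    """Parameter count of a depthwise conv followed by a 1x1 pointwise conv."""
    # Depthwise: one kernel_size-wide filter per input channel.
    depthwise = c_in * kernel_size + (c_in if bias else 0)
    # Pointwise: a 1x1 convolution that mixes channels.
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical layer sizes (assumptions, not taken from the model).
c_in, c_out, k = 64, 64, 11

standard = conv1d_params(c_in, c_out, k)
separable = depthwise_separable_conv1d_params(c_in, c_out, k)

print(f"standard conv:            {standard} parameters")
print(f"depthwise separable conv: {separable} parameters")
print(f"reduction:                {standard / separable:.1f}x")
```

For these assumed sizes the separable layer needs 4,928 parameters instead of 45,120. The saving grows with the kernel size, since the kernel width multiplies only the depthwise term.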
Dataset
We used the FFT-75 dataset, which is composed of 75 file types organized into 6 different scenarios, each with 512-byte and 4096-byte block variants.
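As a rough illustration of what a fixed-size fragment is, the sketch below cuts a byte string into consecutive, non-overlapping blocks of 512 or 4096 bytes. It is not the preparation pipeline used to build FFT-75; the function name `split_into_blocks` and the synthetic payload are assumptions made for this example.

```python
from typing import List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut data into consecutive, non-overlapping blocks of block_size bytes.

    A trailing partial block is dropped so that every fragment has the same length.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    n_full = len(data) // block_size
    return [data[i * block_size:(i + 1) * block_size] for i in range(n_full)]


# Synthetic 10,000-byte payload (hypothetical, not taken from FFT-75).
payload = bytes(i % 256 for i in range(10_000))

blocks_4096 = split_into_blocks(payload, 4096)  # 2 full blocks, 1,808 bytes dropped
blocks_512 = split_into_blocks(payload, 512)    # 19 full blocks, 272 bytes dropped

# Each block can be viewed as a sequence of integers in [0, 255].
first_block_values = list(blocks_512[0])
```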
Results on FFT-75 dataset Scenario 1 (all 75 classes)
Model | Neural Network | Block Size [bytes] | # Params | Accuracy [%] | Speed [ms/block] | Speed [min/GB] |
---|---|---|---|---|---|---|
Our Model | Depthwise Separable CNN | 4096 | 103,083 | 78.45 | 2.65 | 0.055 |
Our Model | Depthwise Separable CNN | 512 | 103,083 | 65.89 | 2.78 | 0.382 |
FiFTy | 1-D CNN | 4096 | 449,867 | 77.04 | 38.189 | 1.366 |
FiFTy | 1-D CNN | 512 | 289,995 | 65.66 | 38.67 | 3.052 |
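The size and speed ratios quoted in the abstract can be checked directly against the table. The short script below does that arithmetic; the numbers are copied from the table, while the dictionary layout and the `ratio` helper are assumptions made for this example.

```python
# Values copied from the Scenario 1 results table.
results = {
    4096: {
        "ours":  {"params": 103_083, "ms_per_block": 2.65,   "min_per_gb": 0.055},
        "fifty": {"params": 449_867, "ms_per_block": 38.189, "min_per_gb": 1.366},
    },
    512: {
        "ours":  {"params": 103_083, "ms_per_block": 2.78,  "min_per_gb": 0.382},
        "fifty": {"params": 289_995, "ms_per_block": 38.67, "min_per_gb": 3.052},
    },
}


def ratio(block_size, metric):
    """How many times larger the FiFTy value is than ours for a given metric."""
    row = results[block_size]
    return row["fifty"][metric] / row["ours"][metric]


for block_size in (4096, 512):
    print(
        f"{block_size}-byte blocks: "
        f"{ratio(block_size, 'params'):.2f}x fewer parameters, "
        f"{ratio(block_size, 'min_per_gb'):.2f}x less time per GB"
    )
```

For 4096-byte blocks this gives roughly 4.4x fewer parameters and about 24.8x less time per GB, in line with the figures stated in the abstract. For 512-byte blocks the ratios are roughly 2.8x and 8.0x.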
Comparison between FiFTy and our model for floating point operations (Mega FLOPs)
Scenario | Fragment Size [bytes] | FLOPs (ours) | FLOPs (FiFTy) |
---|---|---|---|
1 | 4096 | 167.83 | 1047.59 |
1 | 512 | 21.00 | 1801.71 |
2 | 4096 | 167.82 | 1327.90 |
2 | 512 | 20.99 | 918.06 |
3 | 4096 | 167.81 | 647.78 |
3 | 512 | 20.99 | 3579.57 |
4 | 4096 | 167.81 | 2378.52 |
4 | 512 | 20.98 | 1576.71 |
5 | 4096 | 167.81 | 488.37 |
5 | 512 | 20.98 | 2330.48 |
6 | 4096 | 167.81 | 1126.00 |
6 | 512 | 20.98 | 611.30 |
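The FLOPs reduction per scenario follows from dividing the two columns. The sketch below does this for every row; the values are copied from the table, and the variable names are assumptions made for this example.

```python
# (scenario, fragment size in bytes) -> (MFLOPs ours, MFLOPs FiFTy), copied from the table.
mflops = {
    (1, 4096): (167.83, 1047.59), (1, 512): (21.00, 1801.71),
    (2, 4096): (167.82, 1327.90), (2, 512): (20.99, 918.06),
    (3, 4096): (167.81, 647.78),  (3, 512): (20.99, 3579.57),
    (4, 4096): (167.81, 2378.52), (4, 512): (20.98, 1576.71),
    (5, 4096): (167.81, 488.37),  (5, 512): (20.98, 2330.48),
    (6, 4096): (167.81, 1126.00), (6, 512): (20.98, 611.30),
}

# Factor by which our model reduces FLOPs relative to FiFTy.
reduction = {key: fifty / ours for key, (ours, fifty) in mflops.items()}

for (scenario, size), factor in sorted(reduction.items()):
    print(f"scenario {scenario}, {size:>4} bytes: {factor:6.2f}x fewer FLOPs")
```

Our FLOPs are almost constant across scenarios for a given fragment size, so the variation in the reduction factor comes from the FiFTy column.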
Confusion Matrix for Scenario 1 (4096-byte and 512-byte blocks)

