A Generic Storage Interface
Websites often have a lot of different assets and files for the various areas of a website – content management systems, photo galleries, e-commerce product photos, etc. As a site grows, so do its storage demands and backup requirements, and as storage demands grow it typically becomes necessary to distribute those files across multiple servers or services.
One method for managing disparate file systems is to use custom PHP stream wrappers and configurable paths, but some extensions don't yet support custom wrappers for file access. An alternative that I've been using is an object- and service-oriented approach that keeps my application code independent of the storage configuration.
Interface
At the core of my design is the asset storage interface, which looks something like:
interface StorageEngineInterface {
    // store a file and return a token that can be used to retrieve it
    function store(SplFileInfo $file);

    // retrieve a locally-accessible SplFileInfo based on the token
    function retrieve($token);

    // remove data from storage based on the token
    function purge($token);
}
The storage engine is responsible for generating a reusable token that can be used for later retrieval. Generally, I simply have it generate a UUID as the token; however, tokens could also carry storage-specific meaning.
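To make the contract concrete, here is a minimal sketch of what a local-disk engine with UUID tokens might look like. It is an illustration under assumptions rather than the actual implementation: the constructor argument, the private path() helper, and the choice to throw RuntimeException on failure are all hypothetical, and the interface is repeated only so the snippet runs on its own.

```php
<?php
// Interface repeated from the article so the sketch is self-contained.
interface StorageEngineInterface {
    function store(SplFileInfo $file);
    function retrieve($token);
    function purge($token);
}

// Hypothetical local-disk engine; constructor and method bodies are assumptions.
class LocalStorageEngine implements StorageEngineInterface {
    private $root;

    public function __construct($root) {
        // create the storage directory on first use
        if (!is_dir($root) && !mkdir($root, 0700, true)) {
            throw new RuntimeException("Unable to create $root");
        }
        $this->root = rtrim($root, '/');
    }

    public function store(SplFileInfo $file) {
        // build a random (version 4) UUID to use as the token
        $bytes = random_bytes(16);
        $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
        $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);
        $token = vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));

        if (!copy($file->getPathname(), $this->path($token))) {
            throw new RuntimeException('Unable to store ' . $file->getPathname());
        }
        return $token;
    }

    public function retrieve($token) {
        $path = $this->path($token);
        if (!is_file($path)) {
            throw new RuntimeException("Unknown token $token");
        }
        return new SplFileInfo($path);
    }

    public function purge($token) {
        // purging an unknown token is treated as a no-op
        $path = $this->path($token);
        if (is_file($path)) {
            unlink($path);
        }
    }

    private function path($token) {
        // accept only UUID-shaped tokens so a token can never escape the root
        if (!preg_match('/^[0-9a-f-]{36}$/', $token)) {
            throw new InvalidArgumentException("Invalid token $token");
        }
        return $this->root . '/' . $token;
    }
}
```

Because the token is just a file name under the root directory, retrieve needs no lookup table; an engine whose tokens carry storage-specific meaning would replace the UUID step with its own scheme.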
Sample Storage Engines
I've used several base implementations:
- LocalStorageEngine – the simplest storage using a local/NFS filesystem
- AWSS3StorageEngine – using AWS S3 for storage
- SftpStorageEngine – using PHP's ssh2 module to access files on servers via SFTP
- AtlassianConfluenceStorageEngine – managing documents within Confluence wikis
Remote services like AWS S3 and SFTP can cause significant performance issues. To help with that, I use a CachedStorageEngine implementation. It accepts two StorageEngineInterface arguments: one as the upstream engine, and one as the local cache. For example:
new CachedStorageEngine(
    new AWSS3StorageEngine(new Aws\S3\S3Client(...), 'bucket.example.com', 'my-prefix'),
    new LocalStorageEngine('/tmp/s3-bucket.example.com-cache')
);
And since CachedStorageEngine is just another implementation of StorageEngineInterface, it can be used interchangeably within the application, with performance being the only difference.
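A caching decorator along these lines could be sketched as follows. This is a simplified illustration, not the real class: in particular, the in-memory $map from upstream tokens to cache tokens is an assumption made for brevity (it is lost between requests, so a real implementation would need to persist it or derive the cache location from the upstream token).

```php
<?php
// Interface repeated from the article so the sketch is self-contained.
interface StorageEngineInterface {
    function store(SplFileInfo $file);
    function retrieve($token);
    function purge($token);
}

// Hypothetical decorator: serves reads from a local cache engine when possible.
class CachedStorageEngine implements StorageEngineInterface {
    private $upstream;
    private $cache;
    // upstream token => cache token; in-memory only (assumption for brevity)
    private $map = [];

    public function __construct(StorageEngineInterface $upstream, StorageEngineInterface $cache) {
        $this->upstream = $upstream;
        $this->cache = $cache;
    }

    public function store(SplFileInfo $file) {
        // the upstream token is the one the application sees
        $token = $this->upstream->store($file);
        // warm the cache so the next retrieve stays local
        $this->map[$token] = $this->cache->store($file);
        return $token;
    }

    public function retrieve($token) {
        if (!isset($this->map[$token])) {
            // cache miss: fetch once from upstream, then keep a local copy
            $this->map[$token] = $this->cache->store($this->upstream->retrieve($token));
        }
        return $this->cache->retrieve($this->map[$token]);
    }

    public function purge($token) {
        // drop the cached copy first, then the authoritative one
        if (isset($this->map[$token])) {
            $this->cache->purge($this->map[$token]);
            unset($this->map[$token]);
        }
        $this->upstream->purge($token);
    }
}
```

Since the decorator only talks to the interface, the cache does not have to be a local disk; any pair of engines can be combined this way.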
Application Usage
Using dependency injection, each of the storage backends becomes an independent service, configured depending on the application requirements. The application then has no storage-specific calls like copy, file_get_contents, fopen, etc., and the code looks something like:
// storage service for photos
$storage = $dic->get('photo_storage');

// save a new photo
$photo = new PhotoRecord();
$photo->setAssetToken(
    $storage->store($request->files->get('upload'))
);

// use the photo
$image = (new Imagine\Gd\Imagine())->open(
    $storage->retrieve($photo->getAssetToken())
);

// delete the photo
$storage->purge($photo->getAssetToken());
$photo->delete();
Since retrieve will always return a SplFileInfo instance, it can be referenced and handled like a local file (as demonstrated by the open call in the example).
Complicating Things
The asset storage interface itself is fairly primitive, but it allows for some more complex configurations:
- because the engine is provided through dependency injection, switching storage engines requires no changes to application code
- complex storage rules can be combined with meaningful tokens; for example, very large files can be stored on different disks, with a token prefix identifying that class of storage
- a fallback storage class can go through a chain of storage engines until one of them is able to store or retrieve a token
- operations can be deferred internally via a queue manager (e.g. instead of storing a file to S3 immediately and waiting for the upload, write it locally and create a job to upload it in the background)
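The fallback idea from the list could be sketched roughly like this. The class name FallbackStorageEngine and its firstSuccess() helper are hypothetical, and the sketch assumes two things the interface itself does not specify: engines signal failure by throwing RuntimeException, and tokens are unique across engines (as UUIDs would be).

```php
<?php
// Interface repeated from the article so the sketch is self-contained.
interface StorageEngineInterface {
    function store(SplFileInfo $file);
    function retrieve($token);
    function purge($token);
}

// Hypothetical chain: tries each engine in order until one succeeds.
class FallbackStorageEngine implements StorageEngineInterface {
    private $engines;

    public function __construct(array $engines) {
        $this->engines = $engines;
    }

    public function store(SplFileInfo $file) {
        return $this->firstSuccess(function (StorageEngineInterface $engine) use ($file) {
            return $engine->store($file);
        });
    }

    public function retrieve($token) {
        return $this->firstSuccess(function (StorageEngineInterface $engine) use ($token) {
            return $engine->retrieve($token);
        });
    }

    public function purge($token) {
        // the token may live in any engine, so ask all of them and ignore failures
        foreach ($this->engines as $engine) {
            try {
                $engine->purge($token);
            } catch (RuntimeException $e) {
                // assumption: an engine that cannot purge simply did not hold the token
            }
        }
    }

    private function firstSuccess(callable $operation) {
        $last = null;
        foreach ($this->engines as $engine) {
            try {
                return $operation($engine);
            } catch (RuntimeException $e) {
                // remember the failure and move on to the next engine
                $last = $e;
            }
        }
        throw new RuntimeException('No storage engine succeeded', 0, $last);
    }
}
```

Ordering the engines from fastest to slowest would make the chain behave like a tiered store; silently ignoring purge failures is a simplification that a production version would probably want to log.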
Summary
Abstracting storage logic out of my application code makes my life much easier, both as a developer and as a systems administrator, when managing where files are located and relocating them as necessary.