A survey of data modeling
There are many different ways of modeling data. They all have their place, and all have places where they are a poor fit.
The spectrum of options below is defined mainly by the degree to which they differentiate between read and write models, and correspondingly how powerful-but-also-complex they are. "Model" in this case usually corresponds to a class, or a class with one or more composed classes.
The simplest approach is raw SQL: there is no formal data definition beyond the SQL (or other database) schema. The application just runs arbitrary SQL queries, both read and write, wherever it sees fit.
In a slightly better variant, SQL queries are all confined to selected objects that act as an API to the rest of the application. Arbitrary code does not call SQL, but it can call a method on this object that will call SQL.
The SQL could be hand-crafted, use a query builder of one kind or another, or a little of each.
This approach may work at a very small scale, where building something more formal isn't worth the effort. However, the tipping point where it is worth the effort comes very, very early.
The most widely used approach is known as "Create Read Update Delete" (CRUD). Those are the four standard operations. In this case, the system models a series of data objects called Entities. While technically Entities do not need to correspond 1:1 to a particular database table, in practice that is often the case. An entity could also have dependent tables, the details of which are mostly hidden.
CRUD is usually managed by an ORM, or Object-Relational Mapper. The ORM attempts to hide all SQL logic from the user, providing a consistent interface pattern. A user Reads (loads) an Entity by ID, possibly Updates it (edits some value), and then saves it back to the database. The user only interacts with the Entity object.
There are two main variants of ORM: Active Record, in which the Entity object has direct access to the database connection to load and save itself, and Data Mapper, in which the Entity is ignorant of its storage and a separate service (a mapper, or repository, or various other names) is responsible for the loading and saving. Active Record is often easier to implement from scratch, so it is popular with RAD-oriented tools (like Ruby on Rails or Laravel). It is, however, a vastly inferior design, as it severely hinders testing, encapsulation, and more advanced cases. Setting up a Data Mapper is almost always worth it, as the cost is not substantially higher for a skilled developer.
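As a rough sketch of the difference (the class and method names here are hypothetical, not any particular framework's API):

```php
// Active Record style: the entity knows how to persist itself,
// so it carries a database connection around with it.
$post = Post::find(42);
$post->title = 'New title';
$post->save();

// Data Mapper style: the entity is plain data; a separate repository
// service owns all loading and saving, which keeps the entity
// testable in isolation from the database.
$post = $postRepository->find(42);
$post->title = 'New title';
$postRepository->save($post);
```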
CRUD falls down in a few key areas. Notably, an ORM in concept does not offer any native way to create compound views, showing a subset of fields from three different related entities, for example. Some ORMs provide a mechanism of some sort, but rarely is it as capable or efficient as just writing SQL.
The impedance mismatch between object models and relational models has been called "the Vietnam of Computer Science": early success draws you in, and the quagmire only gets worse the more effort you put into it. Simple ORMs are straightforward to build, but they have an upper bound on complexity before they become too unwieldy.
There is a variant of CRUD known as Create Read Archive Purge (CRAP), which does not get anywhere near as much use as it should. In this approach, an Entity is not updated in place when modified. Instead, an entirely new copy of the Entity is stored in the database, along with some version identifier. That gives each Entity a history of its state over time, with a built-in ability to review that history and revert to an earlier state.
No Entity is deleted; if an entity needs to be deleted, a new version of it is saved that has a "deleted" flag set to true. Any SQL that interacts with the Entity must then be written to exclude older versions and deleted versions, unless specifically instructed not to.
If the historical data of a given Entity is no longer valuable, or is not valuable after a period of time, a separate Purge command can remove old revisions, including removing deleted entities entirely. The time frame for such purges and whether they can be user-triggered varies with the implementation.
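A minimal sketch of the write path, assuming a hypothetical pages table with a (page_id, revision) composite key and a simple Page value object:

```php
// Hypothetical CRAP-style save: modifications never UPDATE in place.
// Each save INSERTs a new revision row; old rows remain as history.
function savePage(\PDO $db, Page $page): void
{
    $stmt = $db->prepare(
        'INSERT INTO pages (page_id, revision, title, body, deleted)
         VALUES (:id, :rev, :title, :body, :deleted)'
    );
    $stmt->execute([
        'id' => $page->id,
        'rev' => $page->revision + 1,
        'title' => $page->title,
        'body' => $page->body,
        'deleted' => (int) $page->deleted,
    ]);
}

// Reads must then filter to the latest, non-deleted revision:
// SELECT * FROM pages p
// WHERE p.page_id = :id AND p.deleted = 0
//   AND p.revision = (SELECT MAX(revision) FROM pages WHERE page_id = :id);
```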
The advantage is, of course, the history and rollback ability. It's also relatively easy to extend it to include forward revisions, which are revisions that will become the active revision at some point in the future (either upon editorial approval or some time trigger).
The downside is the extra tracking required, which means every bit of SQL that interacts with a CRAP Entity needs to be aware of its CRAPpiness. Writing arbitrary custom SQL becomes more problematic in this case, as a query that forgets to account for old revisions or deleted entities could return unexpected data. That is especially true with more complex relationships. It also raises questions like "should Entity A getting a new revision cause Entity B to get a new revision, too? Should Entity A point to Entity B, or to a specific revision of Entity B?" All possible answers to those questions are valid in some situations but not others. There may also be performance considerations if there are many revisions of many Entities, although that is a solvable problem with smart database design.
Nonetheless, I would argue CRAP is still superior to CRUD in most editorial-centric environments (news websites, company sites, etc.).
An extension available to both CRUD and CRAP is Projections. Usually Projections are discussed in the context of CQRS or EventSourcing (see below), but there's no requirement that they only be used there.
A Projection is the fancy name for stored data that is derived from other stored data. When the primary data is updated, an automated process causes the projection to be updated as well. That automation could be in application logic or SQL triggers/stored procedures; I would even consider an SQL View (either virtual or materialized) to be a form of Projection.
Projections are useful when you want the read version of the data structured very differently than the write version, or want it presented in some way that is expensive to compute on-the-fly.
For example, if you want a list of all sales people, their weekly sales numbers, and the percentage change from last week, ordered by sales numbers, that could be expensive to compute on the fly. It could also be complex, if that data has to be derived from individual sale records spread across multiple tables, and sales team information is similarly well normalized across multiple tables. Instead, either on a schedule or whenever a sale record is updated, some process can compute that data (either the whole table, or just the one record it needs to update) and save it to a sales_leaderboard table. Viewing that information is then a super simple, super fast single-table SELECT query.
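As a minimal sketch (the sales_leaderboard table comes from the example above; everything else here is hypothetical, and the upsert syntax is MySQL's):

```php
// Hypothetical projection updater: recompute one salesperson's leaderboard
// row whenever one of their sale records changes.
class SalesLeaderboardProjector
{
    public function __construct(private \PDO $db) {}

    public function saleRecorded(int $salesPersonId): void
    {
        // Derive this week's total from the normalized primary data.
        $stmt = $this->db->prepare(
            'SELECT COALESCE(SUM(amount), 0) FROM sales
             WHERE sales_person_id = :id AND sale_date >= :weekStart'
        );
        $stmt->execute([
            'id' => $salesPersonId,
            'weekStart' => date('Y-m-d', strtotime('monday this week')),
        ]);
        $total = (float) $stmt->fetchColumn();

        // Upsert the derived row; reads are then a single-table SELECT.
        $upsert = $this->db->prepare(
            'INSERT INTO sales_leaderboard (sales_person_id, weekly_sales)
             VALUES (:id, :total)
             ON DUPLICATE KEY UPDATE weekly_sales = VALUES(weekly_sales)'
        );
        $upsert->execute(['id' => $salesPersonId, 'total' => $total]);
    }
}
```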
If that table ever becomes corrupted or out of date, or we just want to change its structure, the data can just be wiped and rebuilt from the existing primary data. Projections are always expendable. If not, they're not Projections.
A system can use as many or as few Projections as needed, built in a variety of ways. As usual, there's more than one way to feed a cat. If heavily used, Projections form essentially the entire read model; there's no need to read Entities from the primary data except for update purposes.
Technically, any search index (Elasticsearch, Solr, Meilisearch, etc.) is a Projection. There is no requirement that the Projection even be in SQL, just that it is expendable, rebuildable data in a form that is optimized for how it's going to be read.
The next level in read/write separation is Command Query Responsibility Segregation (CQRS). CQRS works from the assumption that the read and write models are always separate.
Often, though not always, the write models are structured as command objects rather than as an Entity per se. That could be low-level (an UpdateProduct command with the fields to change) or high-level (an ApprovePost command with a post ID).
The read models could be structured in an Entity-like way, but do not have to be. CQRS does not require using Projections, though they do fit well.
The advantage of CQRS is, of course, the flexibility that comes with having fully independent read and write models. That allows using the type system to enforce write invariants while having completely separate immutable read models. It also allows separating both read and writes from the underlying Entity definitions; a single update command may impact multiple entities, and a read/lookup can easily span entities.
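For a sense of the shape, here is a hypothetical high-level command and its handler (none of these class names come from a real framework; PostRepository and the approve() method are invented for illustration):

```php
// The write model: an immutable command describing intent, not table fields.
final class ApprovePost
{
    public function __construct(
        public readonly int $postId,
        public readonly int $approverId,
    ) {}
}

// The handler owns the write logic, and may touch several entities at once.
class ApprovePostHandler
{
    public function __construct(private PostRepository $posts) {}

    public function handle(ApprovePost $command): void
    {
        $post = $this->posts->get($command->postId);
        $post->approve($command->approverId);
        $this->posts->save($post);
    }
}
```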
The downside of CQRS is the added complexity that keeping track of separate read and write models entails. It requires great care to ensure you don't end up with a disjointed mess. Martin Fowler recommends only using it within one Bounded Context rather than the system as a whole (though he does not go into detail about what that means). If the read and write models are "close enough," CRUD with an occasional Projection may have less conceptual overhead to manage.
The most aggressive separation between read and write models is Event Sourcing. In Event Sourcing, there is no stored model. The primary data that gets written is just a history of "Events" that have happened. The entire data store is just a history of event objects, with some indexing support.
When loading an object (or "Aggregate" in Event Sourcing speak), the relevant Events are loaded from the store and a "current status" object is built on-the-fly and returned. In practice, in a well-designed system this process can be surprisingly fast. The Event stream also acts as a built-in log of all actions taken, ever.
Event Sourcing also leans very heavily on Projections. Projections can represent the current state of the system as of the most recent event, in whatever form is desired. Storing an Event can trigger a handler that updates Projections, sends emails, enqueues jobs, or anything else.
Importantly, events can be replayed. That means, for example, that creating a new Projection requires only writing the routine that builds the projection, then rerunning the entire Event stream through it. It will then build the Projection appropriately. If the Projection's structure changes, migrating a projected database table is simple: delete the old one, create the new one, rerun the Event stream. Every database table, search index, etc. except for the Event stream itself is disposable, and can be thrown out and recreated at will.
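As a sketch of why that works (the EventStore and Projection interfaces here are invented for illustration), a rebuild is little more than a loop:

```php
interface EventStore
{
    /** @return iterable<object> All events, in the order they were recorded. */
    public function allEvents(): iterable;
}

interface Projection
{
    public function reset(): void;

    public function apply(object $event): void;
}

// Rebuilding a projection: wipe the derived data, then replay history.
// Only the event stream itself is authoritative.
function rebuildProjection(EventStore $store, Projection $projection): void
{
    $projection->reset(); // e.g., truncate the projected table.

    foreach ($store->allEvents() as $event) {
        $projection->apply($event);
    }
}
```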
The downside is that Event Sourcing, like CQRS, requires careful planning. It's a very different mental model, and not a good fit for all situations. Banking is the classic example of where it fits, and where a history of actions taken is the most important data. A typical editorial CMS, however, would be a generally poor fit for Event Sourcing, as most of what it's doing is very CRUD-ish. Nearly all events would be some variation on PostUpdated.
Depending on the complexity of the data, building reasonable Projections could be a challenge. And since Entities/Aggregates are reconstituted from the Event stream, loading one may be cheap or costly depending on how it's modeled.
(This section is, of course, quite subjective.)
All of these models have their trade-offs, and pros/cons. For most standard applications, I would argue that CRUD-with-Projections is the least-bad approach. The ecosystem and known best practices are well established. Edge cases where the read and write models need to differ can often be handled as one-offs, if the system is designed with that in mind. That sort of edges it into CQRS space in limited areas, which is both helpful and risky if viewed as a slippery slope.
Even in a CRUD-based approach, it's possible to have slightly different objects for read and write. If the language supports it, the read objects can be immutable, while the write objects are mutable aside from select key fields (primary key, last-updated timestamp, etc.), which may even be omitted. The line between this split-CRUD approach and CQRS is somewhat fuzzy, though, so be mindful that you don't over-engineer CRUD when you should just use CQRS.
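A rough sketch of that split in PHP (class names hypothetical):

```php
// Read side: immutable, safe to hand to templates or cache.
final class ProductView
{
    public function __construct(
        public readonly int $id,
        public readonly string $name,
        public readonly int $priceCents,
    ) {}
}

// Write side: a mutable working copy. Key fields like the primary key
// and last-updated timestamp are omitted here and managed by the
// persistence layer instead.
class ProductEdit
{
    public function __construct(
        public string $name,
        public int $priceCents,
    ) {}
}
```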
For workflow-heavy applications (like change-approval, or scheduled publishing, etc.), CRAP is likely worth the effort. The ability to have forward and backward revisions greatly simplifies many workflow approaches, and provides a nice audit trail.
Regardless of the approach chosen, it is virtually always worth the effort to define formal, well-typed data objects in your application to represent the models. Using anonymous objects, hashes, or arrays (depending on the language) is almost always going to cause maintenance issues sooner rather than later. Even if using CQRS, or just queries that bypass a CRUD ORM, every set of records read from the database should be mapped into a well-defined, typed object. That is inherently self-documenting, eliminates (or at least highlights as needing attention) many edge cases, provides a common, central place for in-memory handling of those edge cases (e.g., null handling), and so forth.
Additionally, any database interaction should be confined to select, dedicated services that have exclusive responsibility for interacting with the database and turning results into proper model objects. This is true regardless of the model used.
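A minimal sketch of such a service, reusing the hypothetical ProductView class from above (the table name is likewise invented):

```php
// The only place in the application that runs SQL for products.
class ProductRepository
{
    public function __construct(private \PDO $db) {}

    public function find(int $id): ?ProductView
    {
        $stmt = $this->db->prepare(
            'SELECT id, name, price_cents FROM products WHERE id = :id'
        );
        $stmt->execute(['id' => $id]);
        $row = $stmt->fetch(\PDO::FETCH_ASSOC);

        // Every row is mapped into a well-typed object before it leaves
        // this class; null handling happens here, in one place.
        return $row === false
            ? null
            : new ProductView((int) $row['id'], $row['name'], (int) $row['price_cents']);
    }
}
```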
If doing CRUD or CQRS, it may be tempting to optimize updates to only update individual fields that need updating rather than updating an entire Entity at once, including unnecessary fields. I would argue that, in most cases, this is a waste of effort. Modern SQL databases are quite fast and almost certainly smarter than you are when it comes to performance. If you are using a well-established ORM that already does that, it's fine, but if rolling your own the effort involved is rarely worth it. At that point, you're almost merging CRUD and CQRS commands anyway.
Crell/Serde 1.5 released
It's amazing what you can do when someone is willing to pay for the time!
There have been two new releases of Crell/Serde recently, leading to the latest, Serde 1.5. This is an important release, not because of how much is in it, but because of what major things are in it.
That's right, Serde now has support for union, intersection, and compound types! And it includes "array serialized" objects, too.
mixed fields

A key design feature of Serde is that it is driven by the PHP type definitions of the class being serialized/deserialized. That works reasonably well most of the time, and is very efficient, but can be a problem when a type is mixed. When serializing, we can just ignore the type of the property and use the type of the value. Easy enough. When deserializing, though, what do you do? In order to support non-normalized formats, like streaming formats, the incoming data is opaque.
The solution is to allow Deformatters to declare, via an interface, that they can derive the type of the value for you. Not all Deformatters can do that, depending on the format, but all of the array-oriented Deformatters (json, yaml, toml, array) are able to, and that's the lion's share of format targets. Then when deserializing, if we hit a mixed field, Serde delegates to the Deformatter to tell it what the type is. Nice.
Sometimes that's not enough, though. Especially if you're trying to deserialize into a typed object, just knowing that the incoming data is array-ish doesn't help. Serde 1.4 therefore introduced a new type field for mixed values: #[MixedField]. MixedField takes one argument, $suggestedType, which is the object type that should be used for deserialization. If the Deserializer says the data is an array, then it will be upcast to the specified object type.
```php
class Message
{
    public string $message;

    #[MixedField(Point::class)]
    public mixed $result;
}
```
When serializing, the $result field will serialize as whatever value it happens to be. When deserializing, scalars will be used as-is, while an array will get converted to a Point class.
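As a quick usage sketch (assuming a simple Point class with public $x and $y properties and a promoted constructor; the values are invented):

```php
use Crell\Serde\SerdeCommon;

$serde = new SerdeCommon();

$message = new Message();
$message->message = 'Target acquired';
$message->result = new Point(x: 1, y: 2);

$json = $serde->serialize($message, format: 'json');
// {"message":"Target acquired","result":{"x":1,"y":2}}

$back = $serde->deserialize($json, from: 'json', to: Message::class);
// $back->result is a Point instance again, because #[MixedField] suggested it.
```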
PHP has supported union types since 8.0, intersection types since 8.1, and mixing the two since 8.2. But they pose a similar challenge for serialization.
The way Serde 1.5 now handles that is to simply fold compound types down to mixed. As far as Serde is concerned, anything complex is just "mixed," and we just defined above how that should be handled. That's... remarkably easy. Neat.
If the type is a union, specifically, then there's a little more we can do.
First, if a union type doesn't specify a suggestedType but the value is array-ish, Serde will iterate through the listed types and pick the first class or interface listed. That won't always be correct, but since the most common union type will likely be something like string|array or string|SomeObject, it should be sufficient in most cases. If not, specifying the $suggestedType explicitly is recommended.
Second, a separate #[UnionField] attribute extends MixedField and adds the ability to specify a nested TypeField for each of the types in the list. The most common use for that would be for an array, like so:
```php
class Record
{
    public function __construct(
        #[UnionField('array', [
            'array' => new DictionaryField(Point::class, KeyType::String),
        ])]
        public string|array $values,
    ) {}
}
```
In this case, if the deserialized value is a string, it gets read as a string. If it's an array, then it will be read as though it were an array field with the specified #[DictionaryField] on it instead. That allows upcasting the array to a list of Point objects (in this case), and validating that the keys are strings.
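A quick sketch of what that looks like in use (same hypothetical Point class and $serde instance as above; the JSON payloads are invented):

```php
// A string value stays a string.
$record = $serde->deserialize('{"values": "none"}', from: 'json', to: Record::class);
// $record->values === 'none'

// An array value is upcast according to the nested DictionaryField.
$json = '{"values": {"start": {"x": 0, "y": 0}, "end": {"x": 3, "y": 4}}}';
$record = $serde->deserialize($json, from: 'json', to: Record::class);
// $record->values is now ['start' => Point, 'end' => Point], with string keys enforced.
```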
Another unrelated but very cool fix is a long-standing bug when flattening array-of-object properties. Previously, their type was not respected. Now it is. What that means in practice is you can now do this:
```json
[
    {"x": 1, "y": 2},
    {"x": 3, "y": 4}
]
```

```php
class PointList
{
    public function __construct(
        #[SequenceField(arrayType: Point::class)]
        public array $points,
    ) {}
}

$json = $serde->serialize($pointList, format: 'json');
$serde->deserialize($json, from: 'json', to: PointList::class);
```
Boom. Instant top-level array. Previously, this behavior was only available when serializing to/from CSV, which had special handling for it. Now it's available to all formats.
Because compound types were only introduced in PHP 8.2, Serde 1.5 now requires PHP 8.2 to run. It will not run on 8.1 anymore. Technically it would have been possible to adjust it in a way that would still run on 8.1, but it was a hassle, and according to the Packagist stats for Crell/Serde the only PHP 8.1 user left is my own CI runner. So, yeah, this shouldn't hurt anyone. :-)
These improvements were sponsored by my employer, MakersHub. Quite simply, we needed them, so I added them. One of the advantages of eating your own dogfood: You have an incentive to make it better.
Is your company using an OSS library? Need improvements made? Sponsor them. Either submit a PR yourself or contract the maintainer to do so, or just hire the maintainer. All of this great free code costs time and money to make. Kudos to those companies that already do sponsor their Open Source tool chain.
Mildly Dynamic websites are back
I am pleased to report that my latest side project, MiDy, is now available for alpha testing!
MiDy is short for Mildly Dynamic. Inspired by this blog post, MiDy tries to sit "in between" static site generators and full-on blogging systems. It is optimized for sites that are mostly static and only, well, "mildly dynamic": SMB websites, blogs, agency sites, and other use cases where, frankly, 90% of what you need is markdown files and a template engine... but you still need that other 10% for dynamic listings, form submission, and so on.
MiDy offers four kinds of pages; the README covers the details, though as it's still at version 0.2.0 the documentation is still a work in progress. And of course, it's built for PHP 8.4 and takes full advantage of many new features of the language, like property hooks and asymmetric visibility. Naturally.
I will be converting this site over to MiDy soon. Gotta dog-food my own site, of course. (And finally get rid of Drupal.)
While I wouldn't yet recommend it as production ready, it's definitely ready for folks to try out and give feedback on, and to run test sites or personal sites on. I don't expect any API changes that would impact content at this point, but like I said, it's still alpha so caveat developor.
If you have feedback, please either open an issue or reach out to me on the PHPC Discord server. If you want to send a PR of your own, please open an issue first to discuss it.
I'll be posting more blog posts on MiDy coming up. Whether before or after I move this site to it, we'll see. :-)
Self hosted photo albums
I've long kept my photo backups off of Google Cloud. I've never trusted them to keep them safe, and I've never trusted them to not do something with them I didn't want. Like, say, ingest them into AI training without telling me. (Which, now, everyone is doing.) Instead, I've backed up my photos to my own Nextcloud server, manually organized them, and let them get backed up from there.
More recently, I've decided I really need a proper photo album tool to carry around "wallet photos" of family and such to show people. A few years back I started building my own application for that in Symfony 4, but I ran into some walls and eventually abandoned the effort. This time, I figured I'd see what was available on the market for self-hosted photo albums for me and my family to use.
Strap yourself in, because this is a really depressing story (with a happy ending, at least).
I reviewed 7 self-hosted photo album tools, after checking various review sites for their top-ten lists. None of them panned out. Let's have a look at the mess directly.
Language: TypeScript
License: MIT
PiGallery 2 is intended as a light-weight, directory-based photo album. The recommended way to install it is to use their Docker compose file and nginx conf file... which you have to just manually copy out of Git. (Seriously?) And when I tried to get that to run locally, I could never connect to it successfully. There was something weird with the port configuration, and I wasn't able to quickly figure it out. If I can't get the "easy" install to work, I'm not interested.
Language: PHP/MySQL
License: GPLv2
Unlike many of the others here, Piwigo doesn't provide a Docker image, which is fine, so I set one up using phpdocker.io. Unfortunately, its net installer crashed when I tried to use it, without useful errors. Trying to install manually resulted in PHP null-value errors from the install script. When I looked at the install script, I found dozens upon dozens of file system operations with the @ operator on them to hide errors.
At that point I gave up on Piwigo.
Language: PHP/MySQL
License: GPL, version unspecified
When I first visited the Coppermine website, I got an error that their TLS certificate had expired a week and a half before. How reassuring.
Skipping past that, I was greeted with a website with minuscule text, with a design dating from the Clinton presidency. How reassuring.
Right on the home page, it says Coppermine is compatible all the way down to PHP 4.2, and supposedly up to 8.2. For those not familiar with PHP, 4.2 was released in 2002, only slightly after the Clinton presidency. PHP has evolved, um, a lot in 22 years, and most developers today view PHP 4 as an embarrassment to be forgotten. If their code is still designed to run on 4.2, it means they're ignoring literally 20 years of language improvements, including security improvements. How reassuring.
Oh, and the installation instructions, linked in the menu, are a direct link to some random forum post from 2017. How reassuring.
At this point I was so reassured that I Noped right out and didn't even bother trying to install it.
Language: JavaScript. (Not TypeScript, raw JS as far as I can tell.)
License: None specified.
Although this app showed up on a few top-ten lists, its license is not specified, and installation only offers Windows and Mac. (Really?) The "others" section eventually lets you get to an Ubuntu section, where their recommendation is to install it via... an Apt remote. Which is an interesting choice.
It has a GitHub repo, but that has no license listed at all. Which technically means it's not licensed at all, and so downloading it is a felony. (Yes, copyright law is like that.)
Being a good Netizen, I reached out to the company through their Contact form to ask them to clarify. They eventually responded that, despite some parts of the code being in public GitHub repos, none of it is Open Source.
Noping right out of that one.
Language: Go
License: It's complicated
I actually managed to get this one to run! This one also "installs" via Docker Compose, but it actually worked. This is the only one of the apps I reviewed that I could get to work. Mind you, as a Go app I cannot fathom why it needs a container to run, since Go compiles to a single binary.
Their system requirements are absurdly high. Quoting from their site, "you should host PhotoPrism on a server with at least 2 cores, 3 GB of physical memory, and a 64-bit operating system." What the heck are they doing? It's Go, not the JVM.
In quick experimentation, it seemed decent enough. The interface is snappy and supports uploading directly from the browser.
However, I then ran into a pickle. The GitHub repository says the license is AGPL, which I am fine with. However, the app itself has a License page that is not even remotely close to any Free Software license, listing mainly all the ways you cannot modify or redistribute the code.
I filed an issue on their repository about it, and got back a rather blunt comment that only the "Community Edition" is AGPL, which is a different download. The supported version is not.
Noping right out of this one, too.
Language: Go, with TypeScript for the front-end
License: AGPLv3
Another app that wants you to install via Docker Compose. And when I tried to do so, I got a bunch of errors about undefined environment variables. The install documentation says nothing about setting them, and it's not clear how to do so, so at this point I gave up.
Language: PHP
License: MIT
Lychee is built with Laravel, which I don't care for, but I have used very good Laravel-based apps in the past, so I had high hopes. It talks about using Docker, but unlike the others here it doesn't provide a docker-compose file, just some very long docker run commands.
Their primary instructions are to git-clone the project, then run composer install and npm install. Unfortunately, phpdocker.io is still built using Ubuntu 22.04, which has an ancient version of npm in it, and I didn't want to bother trying to figure out how to upgrade it.
Lychee did offer a demo container, which uses SQLite. That I was able to get to run successfully. However, for unclear reasons it wouldn't actually show any images.
At this point, I gave up.
Rather disappointed in the state of the art, I decided to take a different approach. As I mentioned, I use Nextcloud to store all my images. Nextcloud has a photo app, but the last time I used it, it was very basic, and pretty bad. That was a few years ago, though, so I went searching.
Turns out, not only has Nextcloud Photos improved considerably, there's also an extension app for it called Memories. On paper, it looks like it does everything I'm after: a timeline feed, custom albums that don't require duplicating files, the ability to edit an image's Exif data to show a title and description, plus some fancy extras like mapping geo information onto OpenStreetMap and AI-based tagging, if you have the right additional apps installed. So, would it work?
Turns out... yes. The setup was slightly fiddly, but mostly because it took a while to download all the map data and index a half-million photos. Once it did that, though... it just worked. It does almost everything I was looking for. I haven't figured out how to reorder albums or pictures within an album, and it looks like it doesn't support sub-albums. But otherwise, it does what I need. It even has a (free) mobile app that lets me show off selected pictures on my phone, which is what I was ultimately after.
I have always had a love/hate relationship with Nextcloud. In concept, I love it. Self-hosted file server and application hub? Sign me up! Despite being a PHP dev of 25 years, I've never quite understood why PHP made sense for it, though. And upgrades have always been a pain, and frequently break. But its functionality is just so useful. Apps are hit or miss, ranging from first-rate (like Memories) to meh.
But in this case, it ended up being both the cleanest and most capable option, as well as the easiest to get going, provided I already had a Nextcloud server. So, solution found. I am now a Memories user, and will be setting up accounts for the rest of the family, too.
Property hooks in practice
Two of the biggest features in the upcoming PHP 8.4 are property hooks and asymmetric visibility (or "aviz" for short). Ilija Tovilo and I worked on them over the course of two years, and they're finally almost here!
OK, so now what?
Rather than just reiterate what's in their respective RFCs (there are many blog posts that do that already), today I want to walk through a real-world application I'm working on as a side project, where I just converted a portion of it to use hooks and aviz. Hopefully that will give a better understanding of the practical benefits of these tools, and where there may be a rough edge or two still left.
One of the primary use cases for hooks is to not use them: They're there in case you need them, so you don't need to make boilerplate getter/setter methods "just in case." However, that's not their only use. They're also really nice when combined with interface properties, and delegation. Let's have a look.
Continue reading this post on PeakD.
Tukio 2.0 released - Event Dispatcher for PHP
I've just released version 2.0 of Crell/Tukio! Available now from your favorite Packagist.org. Tukio is a feature-complete, easy to use, robust Event Dispatcher for PHP, following PSR-14. It began life as the PSR-14 reference implementation.
Tukio 2.0 is almost a rewrite, given the amount of cleanup that was done. But the final result is a library that is vastly more robust and vastly easier to use than version 1, while still producing near-instant listener lookups.
Some of the major improvements include a new registration API built around listener() and listenerService(), both of which should be used with named arguments for maximum effect. The old API methods are still supported, but deprecated to allow users to migrate to the new API.

Continue reading this post on PeakD.
Cutting through the static
Static methods and properties have a storied and controversial history in PHP. Some love them, some hate them, some love having something to fight about (naturally).
In practice, I find them useful in very narrow situations. They're not common, but they do exist. Today, I want to go over some guidelines on when PHP developers should, and shouldn't, use statics.
In full transparency, I will say that the views expressed here are not universal within the PHP community. They do, however, represent what I believe to be the substantial majority opinion, especially among those who are well-versed in automated testing.
Continue reading this post on PeakD.
Announcing Crell/Serde 1.0.0
I am pleased to announce that the trio of libraries I built while at TYPO3 have now reached a fully stable release. In particular, Crell/Serde is now the most robust, powerful, and performant serialization library available for PHP today!
Serde is inspired by the Rust library of the same name, and driven almost entirely by PHP Attributes, with entirely pure-function object-oriented code. It's easy to configure, easy to use, and rock solid.
For a full overview, I gave a presentation at Longhorn PHP 2023 that went into its capabilities in detail. Even then, I didn't have time to cover everything! Have a look at the README for a complete list of all the options and features available.
Serde is backed by two other libraries: Crell/AttributeUtils, a robust attribute-parsing framework, and Crell/fp, a set of functional-programming utilities for PHP.
Give all three a try, and see how powerful modern PHP has become!
Technical debt is over-used
The term "technical debt" gets thrown around a lot. Way too much, in fact. Part of that is because it has become a euphemism for "code I don't like" or "code that predates me." While there are reasons to dislike such code (both good and bad), that's not what the term "technical debt" was invented to refer to.
So what does it mean? There are several different kinds of "problematic code," all of which come from different places.
Continue reading this post on PeakD.
Using PSR-3 placeholders properly
In the last 2 years or so, I've run into a number of projects that claim to use the PSR-3 logging standard as published by the PHP Framework Interoperability Group (PHP-FIG, or just FIG). Unfortunately, it's quite clear that those responsible for the project have not understood PSR-3 and how it is intended to work. This frustrates me greatly, as PSR-3's design addresses a number of issues that these projects are not benefiting from, and it reduces interoperability between projects (which was the whole point in the first place).
Rather than just rant angrily online (fun as it is, it doesn't actually accomplish anything), many of my PHP community colleagues encouraged me to blog about using PSR-3 properly. So, here we are.
If you just want the final point, here it is.
If you're writing this:
$logger->info("User $userId bought $productName");
Then you're doing it wrong, abusing PSR-3, and may have a security attack vector. You need to switch to doing this instead:
$logger->info("User {userId} bought {productName}", [
'userId' => $userId,
'productName' => $productName,
]);
And if your logger isn't handling that properly, it means you have it misconfigured and need to fix your configuration.
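For reference, the interpolation a conforming logger performs looks roughly like this sketch, adapted from the example implementation in the PSR-3 specification (Monolog, for instance, ships this behavior as PsrLogMessageProcessor):

```php
// Replace {placeholder} tokens in the message with values from $context.
function interpolate(string $message, array $context = []): string
{
    $replace = [];
    foreach ($context as $key => $val) {
        // Only values that can safely be cast to string are interpolated.
        if (!is_array($val) && (!is_object($val) || method_exists($val, '__toString'))) {
            $replace['{' . $key . '}'] = (string) $val;
        }
    }

    return strtr($message, $replace);
}
```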
If your project's documentation is telling you to do the first one, then your project's documentation is wrong, and it should be fixed.
If you want to understand why, read on.
Continue reading this post on PeakD.