DerEuroMark

A blog about Frameworks (CakePHP), MVC, Snippets, Tips and more



CakePHP Fixture Factories 2.0 (13 May, 5:16 PM)


A clean v2 API is coming

dereuromark/cakephp-fixture-factories just shipped 2.0.0 RC. After a year of incremental cleanup — typed persistEntity() / persistEntities(), the TEntity template on BaseFactory, redirected deprecations — the next major was the right moment to redesign the surface coherently rather than keep layering.

This post walks through the why, the what, and the how to upgrade.

TL;DR

  • One small, internally consistent API: new(), from(), count(), build(), buildMany(), save(), saveMany(), state(), sequence(), sequenceField(), for(), has(), with(), recycle(), query(), table().
  • Every fluent call returns a fresh factory. No more “oops, that test polluted the next one because the factory was reused”.
  • Sharp static analysis: subclass @extends BaseFactory<\App\Model\Entity\Article> once; build(), buildMany(), save(), saveMany() all resolve to the concrete entity type from there.
  • A bundled Rector config covers the mechanical call-site renames so the upgrade is mostly one command.
  • Generator backend is pluggable and auto-detected: install fakerphp/faker or johnykvsky/dummygenerator, the factory picks whichever is available with no config required, and you can plug your own adapter.
  • recycle($entity) reuses an already-built parent across multiple belongsTo branches of an association tree, instead of silently building duplicate parents N times.
  • TableAssertionsTrait adds direct database-state assertions (assertTableHas, assertTableCount, assertEntityExists, …) with failure messages tuned for factory-driven tests.

Why we wanted to modernize

The original design from vierge-noire/cakephp-fixture-factories is what made fast, factory-driven testing the default in CakePHP land in the first place — credit where it’s due. But the codebase was shaped by a PHP 7 / early-PHP-8 sensibility, and a few things had drifted out of step with where the surrounding ecosystem landed:

  • PHP 8.2+ features change what’s idiomatic. Readonly-style immutability, native enum support, first-class callable syntax, sharper generic templates — all of these unlock cleaner shapes than were possible when the API was first drawn. Immutable fluent factories and BackedEnum cases as first-class values both belong in this bucket.
  • The static-analysis bar moved. PHPStan level 8 with strict generics is now table stakes for serious projects. The single @extends BaseFactory<\App\Model\Entity\Article> line carrying through to every terminal — build(), buildMany(), save(), saveMany(), from() — is a meaningful upgrade over the per-method docblock dance that 1.x had to do.
  • The factory-DSL space matured. Laravel’s factories, Foundry in Symfony land, and our own v2 design iterations surfaced patterns worth borrowing — the count() modifier separated from the entry call, directional for() / has(), named state methods over inline state(...) shapes, sequence/cycle helpers. Some of that fits Cake idiomatically, some had to be adapted (we use save over Laravel’s create because Table::save() is the native verb), but the shape benefits from the cross-pollination.
  • The 1.x surface grew organically. make() / makeMany() / getEntity() / getEntities() / persist() / persistEntity() / persistEntities() plus static finders on the factory class — each addition made sense in its moment, but together they were seven-plus terminals you had to keep straight, plus a result-set return path that looked like a query result but wasn’t quite, plus mutable factories that could leak state across reuse. v1.4 cleaned up the typing without changing the shape; v2 is the right moment to redraw the shape itself.

The brief I gave myself going in: a small, internally consistent surface that one paragraph teaches end to end, no footguns the docs have to warn around, and crisp generics so the IDE just knows. Everything below follows from that.
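To make the "mutable factories that could leak state across reuse" footgun concrete, here is a minimal sketch of the clone-on-write pattern behind "every fluent call returns a fresh factory". ImmutableSketchFactory is a made-up stand-in, not the plugin's BaseFactory, and its method bodies are assumptions for illustration only.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in, not the library's BaseFactory.
final class ImmutableSketchFactory
{
    /** @param array<string, mixed> $data */
    private function __construct(private array $data = [], private int $times = 1)
    {
    }

    /** @param array<string, mixed> $data */
    public static function new(array $data = []): self
    {
        return new self($data);
    }

    /** Returns a modified copy; $this is left untouched. */
    public function state(array $overrides): self
    {
        $copy = clone $this;
        $copy->data = $overrides + $copy->data; // overrides win on key clashes

        return $copy;
    }

    /** Returns a copy that will build $times rows. */
    public function count(int $times): self
    {
        $copy = clone $this;
        $copy->times = $times;

        return $copy;
    }

    /** @return list<array<string, mixed>> */
    public function buildMany(): array
    {
        return array_fill(0, $this->times, $this->data);
    }
}

$base = ImmutableSketchFactory::new(['role' => 'user']);
$admins = $base->state(['role' => 'admin'])->count(2);

// The shared base factory is unaffected by the derived chain.
var_dump($base->buildMany() === [['role' => 'user']]); // bool(true)
var_dump(count($admins->buildMany()) === 2);           // bool(true)
```

Because state() and count() never touch $this, a factory stored in a test property or a helper can be chained from any number of tests without one chain bleeding into the next.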

What v2 looks like

Entry and terminals

// Singular, in-memory
$user = UserFactory::new()->build();
// Singular, persisted
$user = UserFactory::new()->save();
// Plural via count() — note the explicit *Many() terminal
$users = UserFactory::new()->count(5)->saveMany();
// Plural with overrides applied to all
$users = UserFactory::new(['admin' => true])->count(3)->saveMany();

build / save was chosen over Laravel’s make / create because save resonates with Table::save() in the CakePHP idiom and create collides with several Cake-side meanings. saveMany() mirrors Table::saveMany() exactly.

State, layered three ways

// Inline — the ad-hoc override
$user = UserFactory::new()->state(['name' => 'Foo'])->save();
// Per-row variation across count()
$users = UserFactory::new()
    ->count(3)
    ->sequence(
        ['role' => 'admin'],
        ['role' => 'editor'],
        ['role' => 'user'],
    )
    ->saveMany();
// Single-column variation that stacks across calls
$articles = ArticleFactory::new()
    ->count(6)
    ->sequenceField('status', 'draft', 'published') // 2-cycle
    ->sequenceField('priority', 1, 5, 10) // 3-cycle
    ->buildMany();

sequenceField() is the new addition. It cycles a single column independently of sequence(), and stacks across different fields with their own cardinalities — the example above produces an LCM-of-6 pattern across status × priority without you having to spell out all six combinations. It also accepts BackedEnum cases natively:

->sequenceField('status', ...Status::cases())
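To see why a 2-cycle and a 3-cycle cover all six combinations across six rows, here is a small standalone sketch. cycleFields() is a hypothetical helper that only models "each field cycles independently by row index"; it is an assumption about the behavior described above, not the plugin's implementation.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not the library's code): build $count rows where each
 * field cycles through its own list of values, independent of other fields.
 *
 * @param array<string, list<mixed>> $cycles
 * @return list<array<string, mixed>>
 */
function cycleFields(int $count, array $cycles): array
{
    $rows = [];
    for ($i = 0; $i < $count; $i++) {
        $row = [];
        foreach ($cycles as $field => $values) {
            $row[$field] = $values[$i % count($values)];
        }
        $rows[] = $row;
    }

    return $rows;
}

$rows = cycleFields(6, [
    'status' => ['draft', 'published'],
    'priority' => [1, 5, 10],
]);

foreach ($rows as $i => $row) {
    echo $i, ': ', $row['status'], ' / ', $row['priority'], PHP_EOL;
}
// 0: draft / 1
// 1: published / 5
// 2: draft / 10
// 3: published / 1
// 4: draft / 5
// 5: published / 10
```

The pattern repeats only after lcm(2, 3) = 6 rows, which is why count(6) yields every status × priority pair exactly once.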

Lifecycle hooks

$user = UserFactory::new()
    ->afterBuild(fn (User $user) => $user->name = 'Built name')
    ->afterSave(fn (User $user) => $user->synced = true)
    ->save();

Both fire for nested factories too — when a child factory is persisted as part of its parent’s cascading save, its own afterSave() callbacks run on the saved children. (That was a quiet 1.x gap.)

Associations: for(), has(), with()

The split that took the most discussion in #40:

  • for() — belongsTo. Auto-resolves the association from the target factory’s table; takes an optional 2nd-arg alias to disambiguate multi-association schemas.
  • has() — hasOne / hasMany / belongsToMany. Same auto-resolve, with an optional 2nd-arg alias and an optional pivot: named arg for habtm join-row data.
  • with('Alias', $factory) — explicit-alias escape hatch (still works, just verbose).
// belongsTo
$article = ArticleFactory::new()
    ->for(AuthorFactory::new(['name' => 'Mark']))
    ->save();
// has-many
$author = AuthorFactory::new()
    ->has(ArticleFactory::new()->count(3))
    ->save();

When the parent table has more than one association pointing at the target (say, Messages with both Sender and Recipient belonging to Users), for() and has() throw rather than silently picking one. The exception itself is paste-ready:

MessageFactory::for(UserFactory::new()) cannot resolve a unique belongsTo —
`Messages` declares 2 associations targeting `Users`:
- Sender (foreign key: sender_id)
- Recipient (foreign key: recipient_id)
Use the explicit form to disambiguate:
MessageFactory::new()->with('Sender', UserFactory::new())
MessageFactory::new()->with('Recipient', UserFactory::new())

for() and has() also accept the alias inline as a second argument if you’d rather not switch helpers:

$message = MessageFactory::new()
    ->for(UserFactory::new(['name' => 'Mark']), 'Sender')
    ->for(UserFactory::new(['name' => 'Lou']), 'Recipient')
    ->save();
// has() with an alias on a habtm association
$post = PostFactory::new()
    ->has(TagFactory::new()->count(3), 'PrimaryTags', pivot: ['featured' => true])
    ->save();

For repeated use, bin/cake bake fixture_factory --methods generates forSender() / forRecipient() wrapper methods on the factory itself, co-locating the alias with the schema knowledge in one place.

Sharing a parent across branches with recycle()

When the same parent shows up on multiple belongsTo branches of a build graph — say, a Country referenced both by User directly and by each Address the user has — 1.x silently built a fresh Country per branch and you’d find out only when a code uniqueness assertion blew up downstream. recycle() hands the factory a pre-built entity to reuse anywhere the graph encounters its source table:

$country = CountryFactory::new(['code' => 'DE'])->save();
$users = UserFactory::new()
    ->count(5)
    ->recycle($country)
    ->has(AddressFactory::new()->count(2)) // Address also belongsTo Country
    ->saveMany();

You end up with 5 users + 10 addresses + 1 country, not 5 users + 10 addresses + 15 countries. Pass several entities (or factories) to recycle() to cover multiple shared parents at once.
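The arithmetic is easy to check with a toy model. The sketch below is not the plugin's build graph: ParentPool and buildUsers() are hypothetical names, and the only thing modelled is "every belongsTo branch either reuses a recycled row or builds a new parent".

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for a build graph; counts how many parents get created. */
final class ParentPool
{
    public int $built = 0;

    /** @var array<string, array<string, mixed>> recycled rows keyed by table name */
    private array $recycled = [];

    public function recycle(string $table, array $row): void
    {
        $this->recycled[$table] = $row;
    }

    public function resolve(string $table, callable $build): array
    {
        if (isset($this->recycled[$table])) {
            return $this->recycled[$table]; // reuse instead of building again
        }
        $this->built++;

        return $build();
    }
}

function buildUsers(ParentPool $pool, int $users, int $addressesPerUser): void
{
    $newCountry = fn (): array => ['code' => 'XX'];
    for ($u = 0; $u < $users; $u++) {
        $pool->resolve('Countries', $newCountry);     // User belongsTo Country
        for ($a = 0; $a < $addressesPerUser; $a++) {
            $pool->resolve('Countries', $newCountry); // Address belongsTo Country
        }
    }
}

$without = new ParentPool();
buildUsers($without, 5, 2);

$with = new ParentPool();
$with->recycle('Countries', ['code' => 'DE']);
buildUsers($with, 5, 2);

echo $without->built, ' vs ', $with->built, PHP_EOL; // prints "15 vs 0"
```

Without recycling, 5 user branches plus 10 address branches each build their own country (15 in total); with it, no extra country is built beyond the one that was saved up front.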

Reads: the static factory surface stays small

// Direct table access (replaces 1.x Factory::get($id))
$article = ArticleFactory::table()->get($id);
// Dedicated query starting point (replaces static Factory::find / count)
$published = ArticleFactory::query()->find('published')->all();

Three statics total: ::new(), ::from(), ::query(). Plus ::table() for the Table instance. That’s the whole static surface.

Wrap an existing entity with from()

$article = $articlesTable->newEntity(['title' => 'Existing']);
$factory = ArticleFactory::from($article);

from(EntityInterface) keeps the entity’s identity intact: _accessible, _virtual, and the source alias all survive the trip, unlike state(EntityInterface), which extracts via toArray(). v2 explicitly rejects combining from($entity) with count(>1) (it never produced N distinct entities anyway) and points you at the proper alternative in the error: new($entity->toArray())->count(N).

Sharper static analysis

Declare the entity type once on the factory:

/**
 * @extends \CakephpFixtureFactories\Factory\BaseFactory<\App\Model\Entity\Article>
 */
class ArticleFactory extends BaseFactory
{
    // ...
}

…and PHPStan / Psalm resolve build(), buildMany(), save(), saveMany(), from() and friends to Article and array<Article> everywhere they’re called. No per-method overrides, no @phpstan-return magic.
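For readers who have not used class-level generics before, the mechanism can be sketched without the plugin. TypedSketchFactory, ArticleSketch and entityClass() below are hypothetical; only the @template / @extends docblock pattern mirrors what the post describes.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical base class; only the docblock pattern mirrors the post.
 *
 * @template TEntity of object
 */
abstract class TypedSketchFactory
{
    /** @return class-string<TEntity> */
    abstract protected function entityClass(): string;

    /** @return TEntity */
    public function build(): object
    {
        $class = $this->entityClass();

        return new $class();
    }

    /** @return list<TEntity> */
    public function buildMany(int $count): array
    {
        $entities = [];
        for ($i = 0; $i < $count; $i++) {
            $entities[] = $this->build();
        }

        return $entities;
    }
}

final class ArticleSketch
{
    public string $title = 'Untitled';
}

/** @extends TypedSketchFactory<ArticleSketch> */
final class ArticleSketchFactory extends TypedSketchFactory
{
    protected function entityClass(): string
    {
        return ArticleSketch::class;
    }
}

$article = (new ArticleSketchFactory())->build();
echo $article->title, PHP_EOL; // analyzers see ArticleSketch here, not object
```

At runtime nothing changes; the single @extends line is what lets a static analyzer substitute ArticleSketch for TEntity in every inherited return type.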

If you have existing factories, the bundled FactoryAnnotatorTask keeps the docblocks in sync. With dereuromark/cakephp-ide-helper installed, bin/cake annotate classes (or annotate all) walks tests/Factory/ automatically.

Database-state assertions

TableAssertionsTrait adds a small set of assertion methods that read off Factory::query() with failure messages tuned for fixture-driven tests:

use CakephpFixtureFactories\TestSuite\TableAssertionsTrait;

class ArticleSyncServiceTest extends TestCase
{
    use TableAssertionsTrait;

    public function testSyncImportsExpectedRows(): void
    {
        ArticleFactory::new(['title' => 'Old', 'status' => 'draft'])->save();
        $this->articleSyncService->run();
        $this->assertTableCount(ArticleFactory::class, 3);
        $this->assertTableHas(ArticleFactory::class, ['title' => 'New', 'status' => 'published']);
        $this->assertTableMissing(ArticleFactory::class, ['status' => 'broken']);
    }
}

Available helpers: assertTableHas, assertTableMissing, assertTableCount, assertTableEmpty, assertEntityExists, assertEntityMissing. Compared to hand-rolling $this->fetchTable('Articles')->find()->where(...)->count() in tests, the failure messages tell you which factory, which conditions, and what the actual rows look like instead of just “Failed asserting 2 matches 3”.

Named entity pools with Story

The new Story scenario abstract class is for fixtures with a bit of structure: when you want a named pool of users (“Admins”, “Editors”), draw random members from it for related rows, and have the whole thing build in one call:

class BlogStory extends Story
{
    public function build(): void
    {
        $this->addToPool('Admins', UserFactory::new(['role' => 'admin'])->count(2)->saveMany());
        $this->addToPool('Authors', UserFactory::new(['role' => 'author'])->count(5)->saveMany());
        foreach ($this->getPool('Authors') as $author) {
            ArticleFactory::new()
                ->count(3)
                ->for($author)
                ->saveMany();
        }
    }
}

addToPool, getPool, getRandom, getRandomSet are the surface; existing FixtureScenarioInterface implementations keep working unchanged.

Test isolation without $fixtures arrays

CakePHP’s classic fixture flow asks you to enumerate every table a test class might touch in a $fixtures property. That works, but it drifts: someone adds a Behavior that hits Logs, the test class doesn’t know, the fixture array doesn’t update, and you find out hours later when CI complains about leaked rows.

FactoryTransactionStrategy flips the model. Configure it once in config/app.php:

'TestSuite' => [
    'fixtureStrategy' => \CakephpFixtureFactories\TestSuite\FactoryTransactionStrategy::class,
],

Every test then runs inside a transaction on the primary connection, opened at setupTest() and rolled back at teardownTest(). Anything written during the test — factories, direct $table->save($entity) calls, raw $connection->execute('INSERT ...') — is automatically reverted. No per-class $fixtures array, no manual cleanup.
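The underlying mechanism is plain database transactions, which can be sketched with PDO alone. This is not the strategy's code: the in-memory SQLite database, the articles table and the "setupTest" / "teardownTest" comments are assumptions for illustration, and the snippet needs the pdo_sqlite extension.

```php
<?php
declare(strict_types=1);

// Requires the pdo_sqlite extension; table and column names are made up.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)');

function countArticles(PDO $pdo): int
{
    return (int)$pdo->query('SELECT COUNT(*) FROM articles')->fetchColumn();
}

// "setupTest": open the transaction before the test body runs.
$pdo->beginTransaction();

// Test body: any kind of write lands inside the open transaction.
$pdo->exec("INSERT INTO articles (title) VALUES ('From a factory')");
$pdo->prepare('INSERT INTO articles (title) VALUES (?)')->execute(['Raw insert']);
$duringTest = countArticles($pdo);

// "teardownTest": roll everything back, no per-table cleanup needed.
$pdo->rollBack();
$afterTest = countArticles($pdo);

echo "during: {$duringTest}, after: {$afterTest}", PHP_EOL; // during: 2, after: 0
```

The rollback does not care how a row got written, which is why the approach needs no list of tables up front.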

Two extras worth knowing:

  • Multi-database setups still skip transactions on connections they never write to. The strategy opens eagerly on the primary connection (default test, override via the protected string $primaryConnection property in a subclass). Secondary connections are tracked lazily through BaseFactory::save() / saveMany() — so a test that only uses one connection only opens one transaction.
  • Generator unique-state resets between tests. The strategy clears the cached generator instances at teardown, so the second test in a class doesn’t inherit the first test’s unique() history and can’t trip OverflowException on a small value space.

For the rare cases that need real commits — code that depends on Model.afterSaveCommit, commit-triggered behaviors, or rows being durably visible across a separate connection — opt into the lazy variant per-class with LazyTransactionTrait, or fall back to Cake’s Eager strategy as a temporary pressure valve. The upgrade guide covers the trade-off.

Pluggable generator backend

The generator behind definition() is no longer hard-wired to Faker. v2 introduces a GeneratorInterface with two adapters in the box: one for fakerphp/faker and one for johnykvsky/dummygenerator.

You don’t have to pick one in config. The resolver runs in this order:

  1. Explicit $type argument to CakeGeneratorFactory::create() — wins if you pass one.
  2. Configure::read('FixtureFactories.generatorType') — wins next, for projects that want to pin a choice.
  3. Auto-detection — if Faker\Generator is loaded, use faker; otherwise fall back to DummyGenerator\DummyGenerator if that’s installed; throw a clear FixtureFactoryException with installation guidance when neither is available.

Faker stays the tiebreaker when both libraries are installed (preserving the prior default), but a project that only depends on johnykvsky/dummygenerator no longer needs to declare anything in config/app.php to make it work — the factory just picks it up.
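The precedence is easier to hold in your head as code. resolveGeneratorType() below is a hypothetical function that mirrors the three steps; the boolean flags stand in for the class-availability checks, RuntimeException stands in for FixtureFactoryException, and the real resolver may differ in detail.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical resolver mirroring the three steps above; the real logic lives
 * in the plugin and may differ in detail.
 */
function resolveGeneratorType(
    ?string $explicit,
    ?string $configured,
    bool $fakerAvailable,
    bool $dummyAvailable,
): string {
    if ($explicit !== null) {
        return $explicit;   // 1. explicit argument wins
    }
    if ($configured !== null) {
        return $configured; // 2. pinned via config
    }
    if ($fakerAvailable) {
        return 'faker';     // 3a. Faker is the tiebreaker
    }
    if ($dummyAvailable) {
        return 'dummy';     // 3b. fall back to DummyGenerator
    }

    throw new RuntimeException(
        'No generator library found; install fakerphp/faker or johnykvsky/dummygenerator.',
    );
}

echo resolveGeneratorType(null, null, true, true), PHP_EOL;       // faker
echo resolveGeneratorType(null, null, false, true), PHP_EOL;      // dummy
echo resolveGeneratorType(null, 'dummy', true, true), PHP_EOL;    // dummy
echo resolveGeneratorType('faker', 'dummy', true, true), PHP_EOL; // faker
```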

Override globally or per call when the auto-detected default isn’t what you want:

// config/app.php — pin Dummy regardless of what's installed
'FixtureFactories' => [
    'generatorType' => 'dummy',
],
// Or per call, scoped to one factory instance
$article = ArticleFactory::new()->setGenerator('dummy')->build();

Why pick one over the other?

  • Faker gives you realistic-looking data and the broadest provider surface — names, addresses, jobTitles, IBANs. The price is that it seeds via PHP’s process-global mt_srand, which interacts with PHPUnit’s randomized test ordering and with anything else in the process touching mt_rand. Two CI runs of the same seeded test can drift because of factors outside the seed.
  • DummyGenerator uses a per-instance XoshiroRandomizer seeded explicitly through the adapter, so the sequence depends only on the seed and the call order on that specific generator instance. CI failures on row 47 of a 100-row factory output replay locally with the same seed. It’s also smaller and faster — no locale catalogue to load — at the cost of a narrower provider surface.
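The difference between process-global and per-instance seeding shows up with PHP's own random API (8.2+), no generator library needed. This sketch uses the built-in Random\Engine\Xoshiro256StarStar as a stand-in for whatever the Dummy adapter wires up internally; drawGlobal() and drawIsolated() are made-up names.

```php
<?php
declare(strict_types=1);

use Random\Engine\Xoshiro256StarStar;
use Random\Randomizer;

/** Draw three numbers from the process-global Mersenne Twister. */
function drawGlobal(bool $noise): array
{
    mt_srand(1234);
    if ($noise) {
        mt_rand(); // unrelated code consuming the shared stream
    }

    return [mt_rand(1, 100), mt_rand(1, 100), mt_rand(1, 100)];
}

/** Draw three numbers from a private, explicitly seeded engine. */
function drawIsolated(bool $noise): array
{
    $randomizer = new Randomizer(new Xoshiro256StarStar(1234));
    if ($noise) {
        mt_rand(); // same unrelated call; it cannot reach the private engine
    }

    return [
        $randomizer->getInt(1, 100),
        $randomizer->getInt(1, 100),
        $randomizer->getInt(1, 100),
    ];
}

// Very likely false: the extra call shifted the shared stream by one draw.
var_dump(drawGlobal(false) === drawGlobal(true));

// Always true: the private engine only advances when this instance is used.
var_dump(drawIsolated(false) === drawIsolated(true));
```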

For reproducible test data either way, set the seed once:

'FixtureFactories' => [
    'seed' => 1234,
    'defaultLocale' => 'en_US', // explicit beats I18n fallback
],

If a test is flaky on the lowest-deps matrix and you can’t pin down why, switching that one test class to Dummy is often enough to confirm whether the flake was a Faker mt_srand interaction.

How to upgrade

The package ships a Rector config that covers the safe, mechanical call-site changes:

vendor/bin/rector process tests --config vendor/dereuromark/cakephp-fixture-factories/rector.php

The bundled rules cover:

  • Factory::make(...) → Factory::new(...)
  • Factory::make($data, $n) → Factory::new($data)->count($n)
  • setDefaultTemplate() wrappers → definition(GeneratorInterface $generator)
  • getEntity() → build(), getEntities() → buildMany()
  • persistEntity() → save(), persistEntities() → saveMany()
  • patchData(...) → state(...) (in factory helper methods such as asAdmin())
  • static query helpers like Factory::find() → Factory::query()
  • Factory::get($id, $opts) → Factory::table()->get($id, $opts)

A few things rector intentionally doesn’t do — like rewriting deprecated persist() calls, because that return type is shape-dependent and needs a human choice between save() and saveMany(). The Factory::find() rule also splits by arity: zero-arg Factory::find() rewrites to Factory::query() directly (CakePHP 5’s SelectQuery::find() requires a finder name, so the chained form would error at runtime), while calls with an explicit finder name keep the chained Factory::query()->find('name') shape.

There’s also a small set of behavior changes since 1.4 that aren’t mechanical — setGenerator() is instance-scoped by default now, setDefaultTemplate() is no longer wired up (the rector handles the rewrite, but if you skip rector your factories produce empty data silently), FactoryTransactionStrategy is eager again on the primary connection. The full list is in the upgrade guide.

What’s next

v2 is out and tested against a real sandbox app, currently available as 2.0.0-rc.1 for adopters who want to upgrade before the final tag drops. If you’re on the 1.4.x line, follow the upgrade guide — the migration tooling does most of the work, and the rest is documented as you hit it.

Issues, PRs, and “this confused me when I tried it” posts on the issue tracker all welcome.

Thanks

A big thank-you to pabloelcolombiano and the vierge-noire team for building cakephp-fixture-factories in the first place. It shaped how a generation of CakePHP projects write tests (fast, factory-driven, no $fixtures-array gymnastics), and this continued version stands entirely on that foundation. With the project no longer maintained as of 2025, we decided to take over. The redesign is a continuation of their work.


CakePHP AuditStash 2.0: Beyond CRUD (4 May, 4:36 AM)


The cakephp-audit-stash plugin has grown a lot of new surface between 1.x and the current 2.0. What started out as a behavior that records entity-level CRUD into an audit_logs table is now a full mini-app for observability: custom action events beyond CRUD, a dashboard, a coverage report, native chat alerting, and a streaming exporter.

This post walks through the highlights of the new major release, grouped by what they unlock for you as a maintainer of an audited Cake application.


Log anything, not just CRUD

2.0’s headline feature is custom action events. The audit trail is no longer limited to Created / Updated / Deleted rows tied to an entity — anything you want to leave a forensic record of can flow through the same persister, the same hash chain, and the same admin viewer.

use AuditStash\Audit;

Audit::log(
    type: 'user.login',
    source: 'Users',
    primaryKey: $user->id,
    data: ['ip' => $request->clientIp()],
    meta: ['user_id' => $user->id, 'user_display' => $user->name],
);

Behind the scenes:

  • A new AuditStash\Audit static facade dispatches the event.
  • EventFactory falls back to a new AuditCustomEvent for unknown types.
  • audit_logs.type widened from VARCHAR(7) to VARCHAR(64) so dotted scope strings fit.
  • The admin templates (timeline, view, index filter dropdown, email alert) all learned to render custom events with a neutral grey marker and a generic Event payload card instead of mis-rendering them as deletions.

BC note: $auditLog->type is now a plain string instead of the AuditLogType enum. If you need the enum form, call AuditLogType::tryFrom($log->type).
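If you have code that branched on the enum, the tryFrom() route looks roughly like this. AuditLogTypeSketch is a hypothetical stand-in with guessed case names and values; check the plugin's actual AuditLogType before copying anything.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in; the plugin's real AuditLogType cases may be named differently.
enum AuditLogTypeSketch: string
{
    case Create = 'create';
    case Update = 'update';
    case Delete = 'delete';
}

function describeType(string $type): string
{
    // tryFrom() returns null instead of throwing for values outside the enum.
    $known = AuditLogTypeSketch::tryFrom($type);

    return $known !== null
        ? 'CRUD event: ' . $known->name
        : 'custom event: ' . $type;
}

echo describeType('update'), PHP_EOL;     // CRUD event: Update
echo describeType('user.login'), PHP_EOL; // custom event: user.login
```

The null branch is the important part: custom types such as user.login are valid strings now, so code that used to assume an enum needs a fallback path.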

A real admin dashboard

The plugin now ships an at-a-glance dashboard at the admin root (configurable via AuditStash.routePath) so you stop landing on a paginated list as the first thing you see.

What’s on it:

  • KPI cards: events today, active users, active sources (7d), coverage percentage.
  • Daily activity chart over the last 30 days — pure CSS stacked bars, no chart library dependency.
  • Top sources and top users over 7 days, click-through to the filtered viewer.
  • Recent activity table reusing the existing AuditHelper::eventTypeBadge and formatRecord helpers.

Alongside it, a new Coverage report at /admin/audit-stash/coverage answers the question “which of my tables are actually being audited?”:

  • Discovers Table classes from the app and every loaded plugin via Plugin::getCollection().
  • Three statuses: Tracked (class exists + behavior attached), Missing (class exists but behavior NOT attached — a coverage gap), and Empirical (events recorded for a source we can’t map to a class — custom event sources, renamed tables, uninstalled plugins).
  • Configurable deny-list via AuditStash.coverage.hidePlugins / AuditStash.coverage.hideTables.
  • A self-recursion guard so the plugin doesn’t try to audit its own audit tables.

Native Slack and Discord alert channels

AuditMonitor already supported alert delivery, but until now you had to hand-write a Channel subclass to get a readable Slack or Discord message. The new release ships two platform-native channels you can drop in directly:

'channels' => [
    'class' => SlackChannel::class,
    'url' => env('SLACK_WEBHOOK_URL'),
],
  • SlackChannel uses Block Kit (header / section / fields blocks) with a severity-colored attachment. Optional username / icon_emoji / channel overrides. Fields are mrkdwn-escaped, so a source containing < or @everyone can’t smuggle markup or pings.
  • DiscordChannel uses the embed format with a decimal-RGB sidebar color and inline fields. Sets allowed_mentions: { parse: [] } defensively, omits the timestamp key when the audit row has no created, and normalises empty / null field values to n/a so Discord doesn’t reject the payload.
  • Both channels link back to the admin view of the row that triggered the alert, so chat recipients can jump straight into the entry instead of pasting source / PK into the URL bar.
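For anyone writing their own channel, the two defensive measures mentioned above are small. The sketch below is not the bundled channels' code: escapeSlackText() is a made-up name that applies the three-character escape Slack documents for message text, and the Discord array only shows the allowed_mentions shape named in the bullet.

```php
<?php
declare(strict_types=1);

/**
 * Escape the three characters Slack treats as control characters in text.
 * The ampersand goes first so the entities added afterwards survive.
 */
function escapeSlackText(string $text): string
{
    return str_replace(['&', '<', '>'], ['&amp;', '&lt;', '&gt;'], $text);
}

// A hostile "source" value that tries to ping the channel and inject a link.
$source = '<!everyone> & <https://evil.example|click>';
echo escapeSlackText($source), PHP_EOL;
// &lt;!everyone&gt; &amp; &lt;https://evil.example|click&gt;

// Discord: an empty parse list tells the API not to resolve any mentions.
$discordPayload = [
    'content' => 'Audit alert for ' . $source,
    'allowed_mentions' => ['parse' => []],
];
echo json_encode($discordPayload['allowed_mentions']), PHP_EOL; // {"parse":[]}
```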

The shared HTTP, retry and error-logging plumbing was extracted into a new AbstractWebhookChannel — the documented extension point for whatever platform-native schema your tenant prefers.

Teams users: there’s no bundled TeamsChannel because Microsoft is sunsetting MessageCard incoming webhooks in favor of Adaptive Cards via Power Automate Workflows, which has a fundamentally different setup and trigger model. The Building your own channel docs section points at AbstractWebhookChannel as the extension seam for whatever schema your tenant currently accepts.

Lifecycle hooks for the monitor

Channels are the happy-path delivery mechanism — but they’re a closed set. Anything beyond the bundled platforms (per-context suppression, alert mutation, forwarding to Sentry, custom incident stores) used to require subclassing.

Two new events on the global EventManager fire around the existing rule-check / alert-send flow:

Event | When | What you can do
AuditStash.Monitor.beforeAlert | After rule.matches() and createAlert(), before any channel runs | stopPropagation() to suppress; setData('alert', $new) to replace
AuditStash.Monitor.afterAlert | After every channel finished | Inspect the [channelName => bool] results map for partial failures

Rule-failure routing went a different way: instead of a third event, the rule-failure logger call now passes the full Throwable ('exception' => $e) so PSR-3 handlers like Monolog’s IntrospectionProcessor or the sentry/sentry Cake bridge pick up the stack natively. No new API surface, full forwarding to your existing error pipeline.

A real export workflow

Export is now its own page, not an inline button. Three pieces:

  • AuditStash\Service\ExportService — streams the query in configurable batches (AuditStash.export.batchSize, default 1000), pre-flights with a count() against AuditStash.export.hardCap (default 100000), and refuses oversized exports with BadRequestException rather than silently truncating.
  • A dedicated /admin/audit-logs/export form page showing the active-filter summary, row-count estimate, format picker (CSV / JSON / NDJSON), and a disabled submit button when the cap would be exceeded.
  • A controller action that streams via php://temp + Laminas\Diactoros\Stream — bounded PHP-process memory, spills to disk past 2 MB.

NDJSON joins CSV and JSON for streaming-friendly machine consumers. Filters carry from the index page through to the form via query string and on through to the streaming download URL — so the row count you confirm is the row count you get. The default 30-day created-at floor is skipped when other narrowing filters (source, primarykey, transactionkey, etc.) are already set, so a deliberately-narrowed view never silently exports zero rows.
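The "pre-flight, then stream" shape can be sketched in a few lines of plain PHP. exportNdjson() and fakeAuditRows() are hypothetical names, LengthException stands in for the BadRequestException the service throws, and the generator stands in for a batched query; only php://temp and its 2 MB spill-to-disk threshold are standard PHP behavior.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical exporter: refuse oversized exports up front, then write one
 * JSON document per line into a temp stream that spills to disk past 2 MB.
 *
 * @param iterable<array<string, mixed>> $rows
 * @return resource rewound stream ready to be sent to the client
 */
function exportNdjson(iterable $rows, int $totalRows, int $hardCap = 100000)
{
    if ($totalRows > $hardCap) {
        throw new LengthException("Export of {$totalRows} rows exceeds the cap of {$hardCap}.");
    }

    $stream = fopen('php://temp/maxmemory:' . (2 * 1024 * 1024), 'w+b');
    foreach ($rows as $row) {
        fwrite($stream, json_encode($row, JSON_THROW_ON_ERROR) . "\n");
    }
    rewind($stream);

    return $stream;
}

/** Generator standing in for a batched database query. */
function fakeAuditRows(int $count): Generator
{
    for ($id = 1; $id <= $count; $id++) {
        yield ['id' => $id, 'type' => 'update', 'source' => 'Articles'];
    }
}

$stream = exportNdjson(fakeAuditRows(3), 3);
echo stream_get_contents($stream);
// {"id":1,"type":"update","source":"Articles"}
// {"id":2,"type":"update","source":"Articles"}
// {"id":3,"type":"update","source":"Articles"}
```

Because each line is a complete JSON document, a consumer can process the file line by line without loading the whole export, which is what makes NDJSON the streaming-friendly option of the three.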

Security: deny-by-default admin access

This is the change most likely to bite on upgrade, and it’s deliberate.

Audit logs commonly contain who-did-what records — PII, IP addresses, before/after field values for every change. An accidentally forgotten host-side route guard would expose more than a typical admin page. So the plugin now refuses to serve any admin action unless AuditStash.adminAccess is explicitly set to a Closure. A missing config key, a non-Closure value, a Closure that returns anything other than literal true, or a Closure that throws — all yield a 403.
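The four failure cases collapse into one small check. adminAccessAllowed() below is a hypothetical function that models the rule as described; what arguments the plugin actually passes to the Closure is not covered here, so the sketch calls it with none.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical gate: only a Closure returning literal true opens the door.
 */
function adminAccessAllowed(mixed $configured): bool
{
    if (!$configured instanceof Closure) {
        return false; // missing key (null) or non-Closure value
    }

    try {
        return $configured() === true; // truthy values like 1 or 'yes' do not count
    } catch (Throwable) {
        return false; // a throwing callback denies instead of failing open
    }
}

var_dump(adminAccessAllowed(null));                                        // bool(false)
var_dump(adminAccessAllowed('is_admin'));                                  // bool(false)
var_dump(adminAccessAllowed(fn () => 1));                                  // bool(false)
var_dump(adminAccessAllowed(fn () => throw new RuntimeException('boom'))); // bool(false)
var_dump(adminAccessAllowed(fn () => true));                               // bool(true)
```

In the plugin, a false result is what surfaces as the 403.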

The config key was also renamed from accessCheck to adminAccess to align with the cakephp-queue posture and to read more naturally (describes what is gated rather than the function shape).

Escape hatch for users who want to delegate fully to their host AppController auth:

'AuditStash' => [
    'adminAccess' => fn() => true,
],

That’s now an explicit “I trust the upstream guard” choice rather than an accidental forgotten gate.

Forensic capture and a sensitive-field rule

Two opt-in observability additions:

EnvironmentMetadata learned a new capture constructor argument so applications can opt in to request-derived meta fields:

new EnvironmentMetadata(
    request: $request,
    capture: ['user_agent', 'referer', 'session_id'],
);

Off by default because these can carry PII / GDPR implications. Empty headers and inactive sessions are skipped so the audit row never gains an empty-string column. Unknown field names are filtered against an allow-list, so a typo can’t smuggle arbitrary values into meta.

SensitiveFieldRule is a new monitor rule that fires when a configured field on a configured table appears in an audit row’s changed (create / update) or original (delete) payload. It mirrors MassDeleteRule’s shape, slots into the existing channel pipeline with no infrastructure changes, and defaults to high severity. Ideal for “alert me when anyone touches users.password_hash”-style rules.

Tamper-evident audit logs

This one actually shipped back in 1.1.0 but never got a proper writeup, so it’s worth covering alongside 2.0: an opt-in SHA-256 hash chain over persisted audit rows, giving AuditStash the integrity guarantees that regulated environments — GoBD (DE), SOX (US), HIPAA, and friends — expect from the audit trail itself.

Each persisted row carries two new columns — prev_hash (the previous row’s hash) and hash (SHA-256 over the canonicalized current row plus that previous link). Editing any historical row breaks the chain at that row and every row after it; a verifier walking the table catches the break and points at the offending row.
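A hash chain of this kind fits in a few functions. The sketch below is a simplified model, not the persister's code: the canonicalization (sorted keys plus json_encode) and the way prev_hash is mixed into the digest are assumptions, and chainHash(), appendRow() and firstBrokenLink() are made-up names.

```php
<?php
declare(strict_types=1);

/** Canonicalize a row (sorted keys) and hash it together with the previous link. */
function chainHash(array $row, ?string $prevHash): string
{
    ksort($row);

    return hash('sha256', ($prevHash ?? '') . json_encode($row, JSON_THROW_ON_ERROR));
}

/** Append a row, linking it to the current chain tail. */
function appendRow(array $chain, array $row): array
{
    $prev = $chain === [] ? null : $chain[array_key_last($chain)]['hash'];
    $chain[] = ['data' => $row, 'prev_hash' => $prev, 'hash' => chainHash($row, $prev)];

    return $chain;
}

/** Return the index of the first broken link, or null if the chain verifies. */
function firstBrokenLink(array $chain): ?int
{
    $prev = null;
    foreach ($chain as $i => $entry) {
        if ($entry['prev_hash'] !== $prev || $entry['hash'] !== chainHash($entry['data'], $prev)) {
            return $i;
        }
        $prev = $entry['hash'];
    }

    return null;
}

$chain = [];
$events = [
    ['id' => 1, 'type' => 'create'],
    ['id' => 2, 'type' => 'update'],
    ['id' => 3, 'type' => 'delete'],
];
foreach ($events as $row) {
    $chain = appendRow($chain, $row);
}

var_dump(firstBrokenLink($chain));    // NULL

$chain[1]['data']['type'] = 'delete'; // tamper with a historical row
var_dump(firstBrokenLink($chain));    // int(1)
```

If the attacker also recomputes the tampered row's hash, the next row's prev_hash no longer matches, so the break just moves one row down instead of disappearing.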

Disabled by default. To turn it on:

'persisterConfig' => [
    'hashChain' => true,
],

…plus the new migration that adds prev_hash / hash / idx_hash. Rows written before the migration stay NULL in both columns — the chain simply anchors at the first row written after you flip the flag, so there’s no destructive backfill step.

A few mechanics worth flagging:

  • Verification is a shipped CLI: bin/cake audit_stash verify_chain [--table=... --chunk=...] — streams the table in bounded memory and exits 1 on the first broken link with a human-readable reason. Drop it in cron or CI.
  • The whole logEvents() batch runs in a single transaction; on MySQL / Postgres the chain tail is read with SELECT ... FOR UPDATE so concurrent writers serialize on it instead of orphaning links. SQLite’s database-level locking gives the equivalent guarantee.
  • Hashing is schema-aware — only fields that exist on the target audit table are included in the digest, so custom audit tables (without e.g. display_value) verify cleanly without payload divergence across installs.
  • Fail-loud: a save() failure mid-batch throws RuntimeException rather than silently dropping a row and breaking the chain at the tail.

Only TablePersister implements this — the Elastic Search persister can’t offer the same ordering guarantee, and the flag is ignored there. If you need tamper-evidence on Elastic, route audit events through SQL first and replicate downstream.

Full rationale, concurrency semantics, and the truncation-at-tail limitation (plus anchoring / heartbeat mitigations) are written up in docs/tamper-evidence.md.

Tracking file uploads (hashes, not content)

Audit rows are not the place to stash uploaded file blobs. The clean approach is a virtual field on the entity that exposes a stable fingerprint — a hash over the stored bytes, a CDN ETag, byte length, whatever is cheap and deterministic for your storage:

// src/Model/Entity/Document.php
protected function _getFingerprint(): ?string
{
    return $this->file_path
        ? hash_file('sha256', WWW_ROOT . $this->file_path)
        : null;
}

Virtual fields participate in entity diffs like any other column, so the audit row records “fingerprint changed from abc… to def…” without your audit table ever seeing the file body. No audit-side machinery required.

For legacy schemas where you can’t add a virtual field (uploads tracked in a sibling join table the audited entity doesn’t know about, for example), there’s an AuditStash.beforeLog event hook that fires before the audit row is persisted. One caveat called out explicitly in the docs: BaseEvent has no public setter for changed / original, so the event-hook path needs a small persister decorator to actually merge your additions in. The pattern is in docs/usage.md.

For the related case of recording that a sensitive field changed without storing the value itself, the existing 'sensitive' behavior config is still the right tool — no new machinery there either.

Testing helper trait

If you maintain a downstream application that uses audit-stash, you’ve probably written a one-off helper to assert “this controller action produced an audit row”. There’s now a shipped one:

use AuditStash\TestSuite\AuditAssertionsTrait;

class OrdersControllerTest extends TestCase
{
    use AuditAssertionsTrait;

    public function testCreateLogsAuditEntry(): void
    {
        $this->post(['controller' => 'Orders', 'action' => 'add'], [...]);
        $this->assertAuditLogged('Orders');
        $this->assertAuditFieldChanged('Orders', 'status', null, 'pending');
    }
}

Mixes into any TestCase that loads plugin.AuditStash.AuditLogs. Exposes assertAuditLogged, assertAuditNotLogged, assertAuditCount, assertAuditFieldChanged, plus a buildAuditQuery seam for custom assertions. Queries the audit_logs table directly so tests verify what was persisted, not just what an in-memory event queue produced.

The new docs/testing.md covers the full reference and the buildAuditQuery extension seam.

Where to next

If you’re upgrading from 1.x to 2.0, the two things to look at on the way in are:

  1. Set AuditStash.adminAccess to an explicit Closure — even fn() => true is fine if you trust your upstream guard. A missing config key now yields a 403.
  2. Run the migration that widens audit_logs.type to VARCHAR(64). It also drops the EnumType mapping on the column, which means $auditLog->type now returns a string instead of an AuditLogType enum.

Everything else is additive.

Feedback, bug reports, and feature requests (ideally as PRs) welcome over at the GitHub repo.


ACL Is Back in CakePHP 27 Apr 3:24 AM (18 days ago)


And This Time It Grew Up

Remember CakePHP 2’s ACL? The ACO/ARO trees, the aros_acos join table, the tutorial that taught a whole generation of us what “hierarchical permissions” even meant? That was a big idea for its time — permissions as data, managed at runtime, not baked into code. A lot of us learned authorization concepts from that component, and the DNA of today’s tools goes right back to it.

Some will also remember that it was painful and slow to work with, though.

Then the ecosystem evolved. CakePHP 3, 4, and 5 shipped cakephp/authorization and cakephp/authentication — clean, policy-based, composable. TinyAuth kept the lightweight INI-file style alive for teams who wanted configuration over code. Both approaches are great at what they do.

You may also have read my last blog post on the authz topic.

The one thing that stayed a little wistful was the admin-UI story. Policies live in code; INI files live on disk; and every so often a project would ask, “Can ops toggle this without a deploy?” The honest answer used to be “not really”. That’s the gap TinyAuth Backend 3.x closes — and it’s the reason ACL, as a concept, is quietly having a comeback.

A complete rewrite, on purpose

TinyAuth Backend 3.x is a full rewrite. The breaking changes are real — the old tiny_auth_allow_rules and tiny_auth_acl_rules tables are dropped by the migration, PHP 8.2 and CakePHP 5.1 are the new minimums, and existing permissions need to be re-imported or recreated. The payoff is a plugin that feels native to modern CakePHP instead of bolted on around it.

[Image: Dashboard]

The easiest way to understand the shape of the rewrite is to look at the schema, the UI, and the integration points side by side.

A normalized schema, eight tables deep

The old 2-table layout has been replaced with eight properly normalized tables:

Table                        Purpose
tinyauth_roles               Roles, with parent/child hierarchy
tinyauth_controllers         Discovered controllers (plugin / prefix / name)
tinyauth_actions             Controller actions, with a public flag
tinyauth_acl_permissions     Role-to-action grants, with optional rule descriptions
tinyauth_resources           Entity resources for resource-based auth
tinyauth_resource_abilities  Abilities per resource (view, edit, delete, publish, …)
tinyauth_scopes              Reusable conditions (e.g. “own records”, “same team”)
tinyauth_resource_acl        Resource-to-role grants, with scope support

Two things stand out here. First, every row is addressable on its own — you can join, query, export, audit, and diff permissions with plain SQL. Second, abilities and scopes are reusable data, not hand-written policy classes. Define own once, apply it to Articles, Projects, Comments, and anything else that has a user_id. That’s the kind of reuse that gets expensive when permissions live in code.

The normalized layout is also what makes features like rule descriptions, inherited-permission rendering, and one-click sync possible at all — they all read from the same tables the runtime enforcement uses.

A real admin UI, not a stale admin skin

The admin panel at /admin/auth/ is written with HTMX + Alpine.js + Tailwind CSS. Toggling a permission is a partial update — no full page reload, no lost scroll position, no “did it save?” anxiety. The layout is standalone: the plugin ships its own chrome, so you don’t have to wrestle your host app’s layout into rendering an admin screen, and it comes with light and dark themes out of the box.

A few details that add up:

  • Tree + matrix navigation on the ACL page: controllers on the left, a role-by-action permission grid on the right.
  • Inherited permissions render as visibly inherited — a different state than direct grants — so you can tell at a glance which cells come from a role and which come from a parent role.
  • Drag-and-drop role ordering for building and adjusting the hierarchy, with parent/child relationships kept consistent as you reorder.

[Image: Roles]

  • Inline rule descriptions: every ACL rule can carry a short description, editable straight from the toggle endpoint and surfaced as a cell tooltip. This is the feature that solves “why does this rule exist?” six months after the fact. You can leave notes like “legacy carve-out for the migration script” or “reporting team needs read-only access during Q2” right next to the rule.
  • Search across controllers, actions, and roles from a single box in the header.

[Image: ACL matrix]

None of these are visual polish for its own sake — they exist because once you’re managing permissions in a UI, the UI has to make intent visible. A green dot is information; a green dot with a tooltip explaining why is a conversation your team doesn’t have to have in Slack.

Sync: permissions that keep up with your code

A runtime permission system is only useful if it knows what actions and resources exist. TinyAuth Backend 3.x ships auto-discovery for both:

  • Controller sync (ControllerSyncService) walks your application (and plugins, and prefixes) and writes discovered controllers and actions into tinyauth_controllers / tinyauth_actions. Added a new action this morning? Click Sync in the admin panel at /admin/auth/sync and it appears in the matrix.
  • Resource sync (ResourceSyncService) discovers entity resources and their abilities, so resource-level authorization stays in step with your models.

The sync is idempotent — re-running it won’t clobber your existing grants. Actions that appear in code get added; existing rows are left alone. (Orphans from deleted controllers aren’t auto-pruned yet — you’ll want to clean those up by hand or with a quick SQL query.) Permissions management stops being a manual catch-up chore.
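The additive behaviour can be sketched in a few lines of plain PHP. This is only an illustration of the idea, not the plugin's ControllerSyncService: the function name syncActions(), the "Controller::action" keys, and the row shape are all assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of an additive, idempotent sync.
 *
 * @param array<string, array<string, mixed>> $stored Rows keyed by "Controller::action"
 * @param list<string> $discovered Keys found in code
 * @return array{rows: array<string, array<string, mixed>>, added: list<string>, orphans: list<string>}
 */
function syncActions(array $stored, array $discovered): array
{
    $added = [];
    foreach ($discovered as $key) {
        if (!isset($stored[$key])) {
            // New in code: add with a safe default, never overwrite.
            $stored[$key] = ['public' => false];
            $added[] = $key;
        }
    }

    // Rows without a matching action in code are reported, not deleted.
    $orphans = array_values(array_diff(array_keys($stored), $discovered));

    return ['rows' => $stored, 'added' => $added, 'orphans' => $orphans];
}

$stored = [
    'Articles::index' => ['public' => true],
    'Legacy::export' => ['public' => false],
];
$discovered = ['Articles::index', 'Articles::edit'];

$first = syncActions($stored, $discovered);
$second = syncActions($first['rows'], $discovered);
```

Running it twice adds Articles::edit once, leaves the existing Articles::index row untouched, and keeps flagging Legacy::export as an orphan for manual cleanup.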

Import / export: a real upgrade path

If you already have a TinyAuth app with auth_allow.ini and auth_acl.ini files, you don’t have to rebuild your permissions from scratch:

bin/cake tiny_auth_backend import allow
bin/cake tiny_auth_backend import acl

That’s the upgrade path from plain TinyAuth to the backend — run the import, and your INI rules show up in the admin UI as editable rows. Going the other way, ImportExportService can export the whole permission set to JSON or CSV, which is genuinely useful for diffing environments, seeding staging, or attaching a snapshot to a pull request.

And if you’re not ready to commit fully? The composite adapters (CompositeAllowAdapter / CompositeAclAdapter) let you keep your existing INI files active and layer DB-backed rules on top, served by a single adapter slot. That’s the gradual-adoption path: switch on the backend, import what you want to migrate, leave the rest in INI, and move rules over at your own pace.

Keeping the admin panel admin-only

The admin panel is at /admin/auth/, which is exactly the kind of URL you want gated. Rather than force you to wire this into your host application’s middleware, TinyAuth Backend 3.x adds a plugin-level hook:

// config/app.php
use Psr\Http\Message\ServerRequestInterface;

'TinyAuthBackend' => [
    'editorCheck' => function (mixed $identity, ServerRequestInterface $request): bool {
        return $identity !== null && in_array('admin', (array)$identity->roles, true);
    },
],

The callable receives the current identity and the request, and runs before every /admin/auth/* action. Return true and you’re in; return anything else and the plugin rejects the request with a 403. Your host application’s middleware stack stays clean, and the gating rule lives next to the plugin config.
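The "true or nothing" contract is worth spelling out. The following plain-PHP sketch assumes a hypothetical gateStatus() helper and uses a plain object in place of a real identity; it is not the plugin's code, just the rule described above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical gate: only a strict boolean true lets the request through.
 */
function gateStatus(callable $editorCheck, mixed $identity, mixed $request): int
{
    return $editorCheck($identity, $request) === true ? 200 : 403;
}

// Same rule as in the config example, with plain objects standing in for identities.
$editorCheck = function (mixed $identity, mixed $request): bool {
    return $identity !== null && in_array('admin', (array)$identity->roles, true);
};

$admin = (object)['roles' => ['user', 'admin']];
$editor = (object)['roles' => 'editor'];

$adminStatus = gateStatus($editorCheck, $admin, null);
$editorStatus = gateStatus($editorCheck, $editor, null);
$guestStatus = gateStatus($editorCheck, null, null);
```

Note the (array) cast: it lets the same check handle a single role string and a list of roles.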

Only the features you want

Not every project needs every feature. The backend exposes five main capabilities as independently togglable features — allow, acl, roles, resources, and scopes — and by default each one auto-enables if its backing table exists. That means you can run migrations selectively (say, just tinyauth_actions for public-action management) and the rest simply stays out of the way.

When you want explicit control, the TinyAuthBackend.features config overrides auto-detection:

// config/app.php
'TinyAuthBackend' => [
    'features' => [
        'allow' => true, // force enabled
        'acl' => true,
        'roles' => true,
        'resources' => false, // force disabled
        'scopes' => false,
    ],
],

Disabled features disappear from the admin navigation entirely — no dead links, no half-rendered pages pointing at tables that don’t exist. This is how you adopt the plugin one capability at a time: start with just allow as a UI for your public-action list, add acl when you want role-level gating, and turn on resources + scopes the day you need entity-level authorization. Each step is a config flag and a migration, not a commitment to the whole stack.
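The precedence between auto-detection and explicit config can be sketched as follows. This is an assumption-laden illustration, not FeatureService itself: resolveFeatures() is a hypothetical name, and apart from allow and tinyauth_actions the feature-to-table mapping is guessed for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical resolver: explicit config wins, otherwise fall back to table detection.
 *
 * @param array<string, string> $backingTables Feature name => table name (assumed mapping)
 * @param list<string> $existingTables Tables present in the database
 * @param array<string, bool> $configured Explicit overrides
 * @return array<string, bool>
 */
function resolveFeatures(array $backingTables, array $existingTables, array $configured): array
{
    $enabled = [];
    foreach ($backingTables as $feature => $table) {
        $enabled[$feature] = $configured[$feature]
            ?? in_array($table, $existingTables, true);
    }

    return $enabled;
}

// Assumed mapping for illustration only; the plugin's real mapping may differ.
$backingTables = [
    'allow' => 'tinyauth_actions',
    'acl' => 'tinyauth_acl_permissions',
    'roles' => 'tinyauth_roles',
    'resources' => 'tinyauth_resources',
    'scopes' => 'tinyauth_scopes',
];
$existingTables = ['tinyauth_actions', 'tinyauth_roles', 'tinyauth_scopes'];

// The scopes table exists, but the explicit flag switches the feature off.
$features = resolveFeatures($backingTables, $existingTables, ['scopes' => false]);
```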

[Image: Allow]

Flexible role sources

Roles don’t have to live in users.role_id. RoleSourceService supports four role source styles out of the box:

  • A database table (the default, for most apps)
  • A Configure path (for roles defined in config at deploy time)
  • A plain array (for tests, fixtures, small apps)
  • A callable, which is the big one — roles can come from a session, a JWT claim, an LDAP group lookup, an SSO gateway response, or any combination of the above

Whichever source you pick, everything downstream — scopes, hierarchy, matrix UI, policy integration — keeps working unchanged. This is the mechanism that makes the ExternalRoles rung below possible, and it’s a genuinely nice separation of concerns.
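As a rough mental model, three of the four styles can be normalized to the same alias-to-id map. The sketch below is plain PHP with a hypothetical resolveRoles() function and a plain array standing in for the Configure store; the database source is left out, and none of this is RoleSourceService's actual API.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical normalizer for three source styles: array, config path, callable.
 *
 * @param array<string, int>|string|Closure $source
 * @param array<string, mixed> $config Stand-in for the application's configuration
 * @return array<string, int> Role alias => id
 */
function resolveRoles(array|string|Closure $source, array $config = []): array
{
    if ($source instanceof Closure) {
        return $source();
    }
    if (is_array($source)) {
        return $source;
    }

    // String: treat it as a dotted path into the config array.
    $value = $config;
    foreach (explode('.', $source) as $segment) {
        $value = is_array($value) ? ($value[$segment] ?? []) : [];
    }

    return is_array($value) ? $value : [];
}

$fromArray = resolveRoles(['user' => 1, 'admin' => 2]);
$fromConfig = resolveRoles('Auth.roles', ['Auth' => ['roles' => ['editor' => 3]]]);

// A closure can pull roles from anywhere: here a fake session array.
$session = ['roles' => ['manager' => 4]];
$fromCallable = resolveRoles(fn (): array => $session['roles']);
```

Everything downstream only ever sees the resulting map, which is why the source can change without touching the permission logic.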

First-class cakephp/authorization integration

The policy side is built around four small pieces, all shipped by the plugin:

  • TinyAuthPolicy plugs into cakephp/authorization as a regular policy class. It ships with both entity-level can*() methods and scopeIndex() / scopeView() built in, so $this->Authorization->applyScope($query) narrows list results through the same DB rules that govern entity access. One source of truth for “can see” and “can edit”, no subclassing required.
  • TinyAuthResolver is a ResolverInterface implementation that maps every known entity, table, or SelectQuery to TinyAuthPolicy — transparently unwrapping queries to their repository so the same resolver works for both authorize($entity) and applyScope($query). Cake’s built-in MapResolver fails at the query path, and OrmResolver forces convention-based App\Policy\* classes; TinyAuthResolver avoids both. Pass it an allowlist of classes, or leave it empty to govern everything.
  • EntityIdentity is a minimal IdentityInterface wrapper around a Cake entity, for apps that resolve users from a session, a JWT claim, or an SSO gateway and don’t load cakephp/authentication. The authorization service argument is optional — without it, can() returns false and applyScope() is a pass-through, which is the correct behavior for role-only strategies.
  • TinyAuthService is the programmatic entry point — canAccess($roles, $resource, $ability, $entity, $user) for checks (the first argument is a role alias or an array of aliases), getScopeCondition(...) for query filtering.

Pick your rung: the four-strategy ladder

All of the above can be adopted incrementally. The demo app ships four usage strategies, arranged as a ladder. Start on the rung that fits where your project is today, and climb as needs grow. Every rung is a legitimate destination — you don’t have to reach the top to “win”.

Rung 1 — AdapterOnly: a GUI for your config

You have a CakePHP app. You have an auth_allow.ini. You’d like non-developers to be able to adjust who-can-do-what without a pull request.

AdapterOnly is made for that. No cakephp/authorization component, no policy classes — just role-level request gating, with the admin panel as a friendly front door to the same data. The migration is small, and the mental model barely changes.

// AdapterOnly: role-level request gating, nothing more.
// TinyAuthBackend reads from the DB, the admin UI writes to it,
// your controllers keep doing exactly what they were doing.

A short path to a real win: ops gets a UI, you get your afternoon back.

Rung 2 — FullBackend: the real deal

This is the rung where it becomes the thing you probably pictured when you first read the phrase resource permissions.

Load cakephp/authorization and wire the plugin-provided TinyAuthResolver into your authorization service. One constructor call, one allowlist:

// Application::getAuthorizationService()
use Authorization\AuthorizationService;
use TinyAuthBackend\Policy\TinyAuthResolver;

$resolver = new TinyAuthResolver([
    \App\Model\Entity\Article::class,
    \App\Model\Entity\Project::class,
]);

return new AuthorizationService($resolver);

That’s the whole wiring. TinyAuthResolver maps both entities and queries to the plugin’s TinyAuthPolicy, transparently unwrapping SelectQuery instances to their repository so the same resolver works for both $this->Authorization->authorize() and ->applyScope(). With that in place, the following works out of the box:

public function edit(string $id)
{
    $article = $this->Articles->get($id);
    $this->Authorization->authorize($article, 'edit');
    // ...
}

public function index()
{
    $query = $this->Authorization->applyScope($this->Articles->find(), 'index');
    // Query now filtered by the user's scope — "own", "team", whatever.
}

Under the hood, authorize() runs through TinyAuthPolicy::can() → TinyAuthService → DB rules → role hierarchy → scopes. Four layers, one call. The demo wires this up for articles (scoped by user_id) and projects (scoped by team_id), and the scope definitions (own, team, department, company) are reusable rows in tinyauth_scopes rather than hand-written policy classes.
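The last two layers, role hierarchy and scopes, are easy to picture in plain PHP. The sketch below is a simplified illustration under my own assumptions: isAllowed() and scopeCondition() are hypothetical names, the data lives in arrays instead of tables, and the real services do considerably more.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical check: a role is allowed if it, or any ancestor role, holds the grant.
 *
 * @param array<string, string|null> $parents Role => parent role
 * @param array<string, list<string>> $grants Role => granted abilities
 */
function isAllowed(string $role, string $ability, array $parents, array $grants): bool
{
    $seen = [];
    $current = $role;
    while ($current !== null && !isset($seen[$current])) {
        $seen[$current] = true; // guards against cycles in the hierarchy
        if (in_array($ability, $grants[$current] ?? [], true)) {
            return true;
        }
        $current = $parents[$current] ?? null;
    }

    return false;
}

/**
 * Hypothetical scope lookup: turns a scope name into a query condition.
 *
 * @param array{id: int, team_id: int} $user
 * @return array<string, int>
 */
function scopeCondition(string $scope, array $user): array
{
    return match ($scope) {
        'own' => ['user_id' => $user['id']],
        'team' => ['team_id' => $user['team_id']],
        default => [],
    };
}

$parents = ['user' => null, 'editor' => 'user', 'admin' => 'editor'];
$grants = ['user' => ['view'], 'editor' => ['edit'], 'admin' => ['delete']];

$adminCanView = isAllowed('admin', 'view', $parents, $grants); // inherited via editor → user
$userCanEdit = isAllowed('user', 'edit', $parents, $grants);
$ownCondition = scopeCondition('own', ['id' => 7, 'team_id' => 3]);
```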

[Image: Resources]

[Image: Scopes]

The best part: the matrix UI and the runtime enforcement read the same data. If a cell lights up green in the admin panel, it lights up green in the controller. That alignment is genuinely rare, and it pays for itself the first time you debug a permission question.

Rung 3 — NativeAuth: same enforcement, your middleware

A sibling of FullBackend with a different wiring diagram. Teams already running cakephp/authentication can keep owning the identity side of the stack; TinyAuth contributes the policy layer and stays out of the middleware conversation.

The enforcement code is identical — the demo’s NativeAuth controllers literally extend the FullBackend ones. That’s the point: moving between rungs is a wiring change, not a rewrite.

Rung 4 — ExternalRoles: roles from anywhere

Sooner or later, the role stops living in users.role_id. It lives in a JWT claim, an LDAP group, a session from an upstream SSO gateway. ExternalRoles supports exactly that: swap TinyAuthBackend.roleSource for a callable (the demo uses a session-backed one via StrategyMiddleware), and everything else — scopes, hierarchy, the matrix UI, the policy layer — keeps working unchanged.

Changing where roles come from without changing how permissions work is the whole reason this rung exists, and it’s a genuinely nice separation.

Under the hood: a service per concern

The rewrite leans hard on small, focused services. If you want to build tooling around the plugin — custom import formats, a different UI, a scheduled sync — these are the handles:

  • TinyAuthService — central permission checking
  • HierarchyService — role hierarchy traversal and inheritance resolution
  • ControllerSyncService — controller/action discovery from your application
  • ResourceSyncService — entity resource/ability discovery
  • ImportExportService — JSON/CSV export and legacy INI import
  • FeatureService — enable/disable optional features at runtime
  • RoleSourceService — flexible role data source resolution

Each one has a single job and a small surface area. Together they’re what makes the admin UI, the CLI commands, and the authorization integration feel like parts of the same system rather than three things stapled together.

Today’s take on a classic idea

A few things worth celebrating about where TinyAuth Backend 3.x landed:

  • Flat, queryable schema. Eight tables, plain joins, plain SQL, no trees of nodes pretending to be both subjects and objects.
  • A good neighbour to cakephp/authorization. It plugs into the framework’s policy layer rather than replacing it.
  • Permissions as self-documenting data. Rule descriptions, export to JSON/CSV, auditable in a query.
  • A real UI stack. HTMX + Alpine + Tailwind, dark mode included, no full-page reloads, scrolls and search that actually work.
  • A gradual adoption story. Composite adapters, INI import, CLI sync, and four usage strategies you can climb through at your own pace.
  • Identity-agnostic. Roles can come from wherever your auth story actually lives.
  • No lock-in. You can stop at any rung, or step back a rung, without tearing things up.

So, is ACL back?

In spirit, yes. The concept that made ACL interesting in the first place — configure permissions as data, manage them in a UI, enforce them at runtime — is back, and it’s wearing modern CakePHP underneath. Normalized schema, reactive UI, first-class policy integration, flexible role sources, and a gradual adoption path that meets your project where it actually is.

Pick a rung, give it a spin, and see how far up the ladder your project wants to climb. The linked demo app below also ships a strict CSP and shows that the plugin works with even the strictest SecurityHeadersMiddleware implementations for maximum safety.

Links


Working with IPs in CakePHP 10 Apr 1:49 PM (last month)


The problem: env('REMOTE_ADDR')

Do not use env('REMOTE_ADDR') or the low-level env() wrapper directly. Both always return the source of the direct TCP connection, which is not necessarily the client’s real IP. Always use ServerRequest::clientIp() when working with the user’s IP address.

This issue becomes visible when you move a CakePHP application behind a reverse proxy – for example, when switching from PHP-FPM to FrankenPHP in Docker behind nginx. env('REMOTE_ADDR') then holds the internal IP of the proxy (e.g. 172.22.0.1 for Docker’s bridge network).

This breaks anything that relies on the raw environment variable:

  • IP-based blacklists stop matching
  • GeoIP lookups resolve to the wrong location
  • Rate limiting becomes ineffective (all users share one IP)
  • Logging records the proxy IP instead of the actual visitor

You’ll typically notice it first in the logs – every request line shows the same address:

[2026-04-09 14:22:01] login.INFO: User logged in {"user_id":42,"ip":"172.22.0.1"}
[2026-04-09 14:22:07] login.INFO: User logged in {"user_id":17,"ip":"172.22.0.1"}
[2026-04-09 14:22:11] login.WARN: Failed login attempt {"ip":"172.22.0.1"}

That’s the Docker bridge gateway, not your visitors.

The fix: ServerRequest::clientIp() + Middleware

CakePHP’s ServerRequest::clientIp() is proxy-aware. When trusted proxies are configured, it reads the real client IP from the X-Forwarded-For or X-Real-IP headers that the reverse proxy sets. When no proxy is involved, it falls back to REMOTE_ADDR – so it works correctly in both environments.

Step 1: Add a TrustedProxyMiddleware

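use Cake\Core\Configure;
use Cake\Http\ServerRequest;
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use Psr\Http\Server\MiddlewareInterface;
use Psr\Http\Server\RequestHandlerInterface;

class TrustedProxyMiddleware implements MiddlewareInterface {
    public function process(
        ServerRequestInterface $request,
        RequestHandlerInterface $handler,
    ): ResponseInterface {
        // setTrustedProxies() only exists on CakePHP's own request class
        if ($request instanceof ServerRequest) {
            $trustedProxies = Configure::read('App.trustedProxies');
            if ($trustedProxies) {
                $request->setTrustedProxies((array)$trustedProxies);
            }
        }

        return $handler->handle($request);
    }
}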

Register it early in your middleware queue – before the error handler.

Step 2: Configure trusted proxy IPs

In app_local.php (or app.php):

'App' => [
    'trustedProxies' => [
        '127.0.0.1',
        '172.16.0.0/12',
        '10.0.0.0/8',
        '192.168.0.0/16',
    ],
],

This tells CakePHP to trust X-Forwarded-For headers from these addresses (your local network and Docker subnets).

[!CAUTION] Never add 0.0.0.0/0 or any public IP range to trustedProxies. If you do, any client on the internet can spoof their IP simply by sending an X-Forwarded-For header. Only list addresses you actually control – your proxy, your load balancer, your Docker subnets.

Multiple proxies in the chain

If your traffic goes through more than one proxy – e.g. Cloudflare → nginx → app – the X-Forwarded-For header becomes a comma-separated list like 203.0.113.5, 198.51.100.7, 172.22.0.1. CakePHP walks that list from right to left, skipping addresses that match trustedProxies, and returns the first untrusted one as the real client.

For this to work, you must trust every intermediate proxy IP. With Cloudflare in front, that means adding Cloudflare’s published IP ranges to your trusted list – otherwise CakePHP will stop at the Cloudflare edge IP and treat that as the client.
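To see why every hop has to be trusted, here is a standalone sketch of the right-to-left walk. It is my own simplified illustration, not CakePHP's implementation: ipMatches() and resolveClientIp() are hypothetical helpers, and the range check only handles IPv4.

```php
<?php
declare(strict_types=1);

/**
 * Checks an IPv4 address against an exact address or a CIDR range.
 */
function ipMatches(string $ip, string $range): bool
{
    if (!str_contains($range, '/')) {
        return $ip === $range;
    }
    [$subnet, $bits] = explode('/', $range, 2);
    $ipLong = ip2long($ip);
    $subnetLong = ip2long($subnet);
    if ($ipLong === false || $subnetLong === false) {
        return false;
    }
    $bits = (int)$bits;
    $mask = $bits === 0 ? 0 : (-1 << (32 - $bits)) & 0xFFFFFFFF;

    return ($ipLong & $mask) === ($subnetLong & $mask);
}

/**
 * Walks X-Forwarded-For from right to left and returns the first untrusted address.
 *
 * @param list<string> $trustedProxies
 */
function resolveClientIp(string $remoteAddr, string $forwardedFor, array $trustedProxies): string
{
    $isTrusted = static function (string $ip) use ($trustedProxies): bool {
        foreach ($trustedProxies as $range) {
            if (ipMatches($ip, $range)) {
                return true;
            }
        }

        return false;
    };

    // Headers from an untrusted peer are ignored entirely.
    if ($forwardedFor === '' || !$isTrusted($remoteAddr)) {
        return $remoteAddr;
    }

    $hops = array_reverse(array_map('trim', explode(',', $forwardedFor)));
    foreach ($hops as $hop) {
        if (!$isTrusted($hop)) {
            return $hop;
        }
    }

    // Every hop was trusted: fall back to the left-most entry.
    return end($hops);
}

$header = '203.0.113.5, 198.51.100.7, 172.22.0.1';

// Edge proxy trusted: the walk reaches the real visitor.
$full = resolveClientIp('172.22.0.1', $header, ['172.16.0.0/12', '198.51.100.7']);

// Edge proxy missing from the list: the walk stops one hop too early.
$partial = resolveClientIp('172.22.0.1', $header, ['172.16.0.0/12']);
```

With the edge address trusted, the result is 203.0.113.5; without it, the walk stops at 198.51.100.7, which is exactly the failure mode described above.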

Alternative: nginx real_ip module

You can also fix this at the web server layer using nginx’s ngx_http_realip_module, which rewrites REMOTE_ADDR before the request reaches PHP. It works, but the application-level approach is usually preferable:

  • Works regardless of the web server (nginx, Caddy, FrankenPHP, Apache, …).
  • Configuration lives with the app, not in ops files.
  • Easier to test and reason about.
  • No surprises if you swap web servers later.

Step 3: Replace env('REMOTE_ADDR') everywhere

Search your codebase for direct REMOTE_ADDR usage:

env('REMOTE_ADDR')
$_SERVER['REMOTE_ADDR']

Replace each occurrence with $request->clientIp() or $this->request->clientIp() depending on context.

If you don’t have a request object in scope, fetch the current one statically via Router::getRequest():

use Cake\Routing\Router;
$ip = Router::getRequest()?->clientIp();

This returns the same ServerRequest that already passed through your TrustedProxyMiddleware, so clientIp() stays proxy-aware.

Caveats:

  • Returns null in CLI/shell context or before the request has been dispatched – always null-check.
  • Avoid relying on it in code that might run before middleware (e.g. bootstrap).
  • For testability, prefer passing the request in explicitly where you can; the static accessor is a service-locator fallback.

clientIp() is correct in both scenarios, so there is no reason to use env('REMOTE_ADDR') directly in application code. This matters even more if you maintain a plugin: developers using it cannot easily adjust that code themselves, so a raw REMOTE_ADDR lookup will end up as a bug ticket sooner or later.

Step 4: Verify it works

The easiest way to confirm everything is wired up correctly is to use the cakephp-setup plugin, which ships with a built-in IP debug page at /admin/setup/backend/ip. It shows you exactly what clientIp() returns, what headers came in, and which proxies were trusted – so you can spot misconfigurations at a glance.

If you’d rather not pull in the plugin, a tiny debug route does the same job:

$routes->get('/debug-ip', function ($request) {
    return new Response(['body' => $request->clientIp()]);
});

Curl it from inside and outside the proxy and confirm you see your real public IP, not the proxy’s internal address.

Console/CLI context

clientIp() only makes sense during an HTTP request. If you have a queue worker or shell command that needs the visitor’s IP – say, to send a “new login from $ip” email asynchronously – you can’t recover it after the fact. Capture the IP at request time and persist it into the job payload, then read it from there in the worker.
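A minimal sketch of that hand-off, in plain PHP and independent of any queue plugin: the function names and the payload shape are made up for illustration, and $clientIp stands in for the value of $request->clientIp().

```php
<?php
declare(strict_types=1);

/**
 * Runs inside the HTTP request: the IP is resolved here and stored with the job.
 */
function buildLoginNoticeJob(int $userId, string $clientIp): string
{
    return json_encode([
        'type' => 'login_notice',
        'user_id' => $userId,
        'ip' => $clientIp,
    ], JSON_THROW_ON_ERROR);
}

/**
 * Runs later in the worker: no request exists, so the IP comes from the payload.
 */
function renderLoginNotice(string $payload): string
{
    $job = json_decode($payload, true, 512, JSON_THROW_ON_ERROR);

    return sprintf('New login from %s', $job['ip']);
}

$payload = buildLoginNoticeJob(42, '203.0.113.5');
$message = renderLoginNotice($payload);
```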

Further reading

Bottom line

If you run CakePHP behind any reverse proxy – Docker, nginx, a load balancer, Cloudflare – always use ServerRequest::clientIp() with trusted proxies configured. It’s a one-time setup that prevents a whole class of subtle bugs.


TOML Support in PHP 30 Mar 12:25 AM (last month)


A Complete Guide to php-collective/toml

TOML has gained traction as a configuration format. Rust’s Cargo, Python’s pyproject.toml, and various CLI tools use it. For PHP projects that need to read or write TOML files, php-collective/toml provides a modern parser and encoder with AST access.

Why Consider TOML?

Configuration formats involve trade-offs. YAML offers flexibility but brings complexity. JSON lacks comments and trailing commas. PHP arrays work but aren’t portable. TOML aims for a middle ground: human-readable, unambiguous, and easy to parse.

TOML Syntax Primer

For those unfamiliar with TOML, here’s a quick overview of the format.

Basic key-value pairs:

title = "My Application"
version = 1.2
enabled = true

Tables (sections):

[database]
host = "localhost"
port = 5432
[database.credentials]
username = "admin"
password = "secret"

Arrays:

ports = [8080, 8081, 8082]
hosts = ["alpha", "beta", "gamma"]

Array of tables:

[[servers]]
name = "alpha"
ip = "10.0.0.1"
[[servers]]
name = "beta"
ip = "10.0.0.2"

Inline tables:

point = { x = 1, y = 2 }
database = { host = "localhost", port = 5432 }

Multiline strings:

description = """
This is a longer description
that spans multiple lines.
Whitespace is preserved."""
regex = '''\d+\.\d+'''

Dates and times:

created = 2024-01-15T10:30:00Z
date_only = 2024-01-15
time_only = 10:30:00

Format Comparison: TOML vs YAML vs NEON

Each format has its quirks. Here’s how they compare in practice.

The “Norway problem” – unquoted strings becoming booleans:

# YAML 1.1 and some legacy parsers: This becomes boolean false
country: NO
# YAML 1.1 and some legacy parsers: These also become booleans
answer: yes
enabled: on
# TOML: Always a string, no ambiguity
country = "NO"

Whitespace sensitivity:

# YAML: Indentation matters - this is valid
database:
  host: localhost
  port: 5432

# YAML: This breaks everything
database:
  host: localhost
    port: 5432 # Wrong indentation = parse error or wrong structure

# TOML: Indentation is purely cosmetic
[database]
host = "localhost"
    port = 5432 # Works fine, though unconventional

Type ambiguity:

# YAML: Is this a string or a number?
version: 1.0 # Parsed as float 1.0
version: "1.0" # Parsed as string "1.0"
version: 1.0.0 # Parsed as string "1.0.0" (silently)
# YAML: Octal numbers surprise
permissions: 0755 # Parsed as decimal 493 in YAML 1.1
# TOML: Explicit typing required
version = 1.0.0 # Parse error - invalid syntax
version = "1.0.0" # String (must quote it)
count = 42 # Integer
ratio = 3.14 # Float
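Both surprises are easy to reproduce from PHP itself, which shares the leading-zero octal convention with YAML 1.1. The snippet is only a quick illustration of the type pitfalls, not of any parser's behaviour.

```php
<?php
declare(strict_types=1);

// A leading zero marks an octal literal in PHP, just as in YAML 1.1.
$fromLiteral = 0755;
$fromString = octdec('755');

// A float cannot carry a version like "1.10": the trailing zero vanishes.
$version = (string)1.10;
```

Here $fromLiteral and $fromString are both 493, and $version ends up as "1.1", which is why version numbers belong in quotes.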

NEON specifics:

# NEON: Entity syntax (used in Nette DI)
service: App\MyService(@dependency, %parameter%)
# NEON: Concise syntax works well for Nette-style configuration
# but entity syntax does not map directly to TOML

Aspect                TOML     YAML         NEON
Whitespace-sensitive  No       Yes          Partial
Comments              #        #            #
Multiline strings     Yes      Yes          Yes
Native datetime       Yes      Yes          No
Type ambiguity        Minimal  Significant  Moderate
Nested depth          Verbose  Concise      Concise
Spec complexity       Simple   Complex      Moderate

Here’s the same configuration expressed in all three formats:

TOML:

# Application configuration
title = "My App"
version = "2.1.0"
debug = false

[database]
host = "localhost"
port = 5432
name = "myapp"
pool_size = 10

[database.credentials]
username = "admin"
password = "secret"

[cache]
driver = "redis"
ttl = 3600

[[servers]]
name = "alpha"
ip = "10.0.0.1"
roles = ["web", "api"]

[[servers]]
name = "beta"
ip = "10.0.0.2"
roles = ["worker"]

[logging]
level = "info"
format = "json"
created = 2024-01-15T10:30:00Z

YAML:

# Application configuration
title: My App
version: "2.1.0"
debug: false
database:
  host: localhost
  port: 5432
  name: myapp
  pool_size: 10
  credentials:
    username: admin
    password: secret
cache:
  driver: redis
  ttl: 3600
servers:
  - name: alpha
    ip: 10.0.0.1
    roles:
      - web
      - api
  - name: beta
    ip: 10.0.0.2
    roles:
      - worker
logging:
  level: info
  format: json
  created: 2024-01-15T10:30:00Z

NEON:

# Application configuration
title: My App
version: "2.1.0"
debug: false
database:
    host: localhost
    port: 5432
    name: myapp
    pool_size: 10
    credentials:
        username: admin
        password: secret
cache:
    driver: redis
    ttl: 3600
servers:
    - name: alpha
      ip: 10.0.0.1
      roles: [web, api]
    - name: beta
      ip: 10.0.0.2
      roles: [worker]
logging:
    level: info
    format: json
    created: 2024-01-15T10:30:00Z

Library Features

Full TOML 1.0/1.1 support with strict validation for keys, tables, strings, numbers, and datetimes. The project also publishes a support matrix for current coverage and known gaps.

Error recovery – Rather than stopping at the first problem, it can collect multiple errors. Useful for tooling and editor integrations:

$result = Toml::tryParse($tomlString);
foreach ($result->getErrors() as $error) {
    echo $error->format($tomlString);
}

Simple API for common operations:

$config = Toml::decodeFile('config.toml');
Toml::encodeFile('output.toml', $data);

Separate Lexer/Parser/AST – The architecture allows direct AST access for analysis without full evaluation.

No required extensions – Works out of the box on PHP 8.2+. The php-ds extension is optional for performance.

Error Recovery

When parsing invalid TOML, the library can continue past the first error and collect multiple problems. This is particularly valuable for editor integrations and linters where you want to surface all detected issues at once:

$result = Toml::tryParse($tomlString);
if ($result->hasErrors()) {
    foreach ($result->getErrors() as $error) {
        echo $error->format($tomlString);
    }
}

Each error includes line and column information, making it straightforward to report precise locations in CLI output or IDE diagnostics.

Working with the AST

For tools that need to analyze configuration structure without evaluating it, such as linters, formatters, or editor plugins, direct AST access is available:

use PhpCollective\Toml\Toml;
use PhpCollective\Toml\Ast\Table;
use PhpCollective\Toml\Ast\KeyValue;

$document = Toml::parse($tomlString);
foreach ($document->items as $node) {
    if ($node instanceof Table) {
        echo "Found table: " . implode('.', $node->key->parts) . "\n";
    } elseif ($node instanceof KeyValue) {
        echo "Found key: " . implode('.', $node->key->parts) . "\n";
    }
}

This separation means you can traverse the document structure, check for specific patterns, or even implement custom validation rules beyond what TOML itself requires.

Real-World Use Cases

Application configuration:

// config/app.toml
$config = Toml::decodeFile(__DIR__ . '/config/app.toml');
$dbHost = $config['database']['host'];
$cacheDriver = $config['cache']['driver'] ?? 'file';

Environment-specific settings:

$env = getenv('APP_ENV') ?: 'development';
$config = Toml::decodeFile(__DIR__ . "/config/{$env}.toml");
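If the environment files only hold overrides rather than full copies, the decoded arrays can be layered on top of a base file. The sketch below uses plain arrays as stand-ins for two decoded files; the file split and the keys are assumptions for illustration, not something the library prescribes.

```php
<?php
declare(strict_types=1);

// Arrays standing in for the decoded contents of a base file and an
// environment file, i.e. the kind of array Toml::decodeFile() yields above.
$base = [
    'debug' => false,
    'database' => ['host' => 'localhost', 'port' => 5432],
    'cache' => ['driver' => 'redis', 'ttl' => 3600],
];
$development = [
    'debug' => true,
    'database' => ['host' => 'db.dev.internal'],
];

// Later arrays win key by key; nested tables are merged, not replaced wholesale.
$config = array_replace_recursive($base, $development);
```

One caveat: array_replace_recursive() also merges lists index by index, so TOML arrays such as ports = [8080, 8081] are overwritten position by position rather than replaced as a whole.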

Reading Python project configuration:

// Parse a pyproject.toml to extract dependencies or metadata
$pyproject = Toml::decodeFile('/path/to/pyproject.toml');
$projectName = $pyproject['project']['name'];
$dependencies = $pyproject['project']['dependencies'] ?? [];
$pythonVersion = $pyproject['project']['requires-python'];

Plugin or package metadata:

# plugin.toml
[plugin]
name = "My Plugin"
version = "2.1.0"
author = "Jane Doe"
[plugin.requirements]
php = ">=8.2"
extensions = ["json", "mbstring"]
[[plugin.hooks]]
event = "beforeSave"
handler = "App\\Hooks\\ValidateData"
[[plugin.hooks]]
event = "afterSave"
handler = "App\\Hooks\\ClearCache"
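For orientation, this is the nested array the TOML spec implies for the file above: [plugin.requirements] becomes a sub-table and each [[plugin.hooks]] entry becomes one element of a list. The array is written out by hand here rather than produced by the library:

```php
<?php
// Hand-written equivalent of plugin.toml, following TOML's table and
// array-of-tables rules.
$plugin = [
    'plugin' => [
        'name' => 'My Plugin',
        'version' => '2.1.0',
        'author' => 'Jane Doe',
        'requirements' => [
            'php' => '>=8.2',
            'extensions' => ['json', 'mbstring'],
        ],
        'hooks' => [
            ['event' => 'beforeSave', 'handler' => 'App\\Hooks\\ValidateData'],
            ['event' => 'afterSave', 'handler' => 'App\\Hooks\\ClearCache'],
        ],
    ],
];

// Index handlers by event name for quick lookup.
$handlersByEvent = array_column($plugin['plugin']['hooks'], 'handler', 'event');
```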

Using TOML Inside Framework Apps

These examples are not all equivalent.

  • CakePHP offers a real extension point for custom configuration engines.
  • Symfony can consume TOML during container building, but TOML is not a first-class Symfony config format.
  • Laravel can read TOML during bootstrapping, but its native config format remains PHP arrays in config/*.php.

CakePHP – Config engine integration:

// src/Configure/Engine/TomlConfigEngine.php
namespace App\Configure\Engine;

use Cake\Core\Configure\ConfigEngineInterface;
use Cake\Core\Exception\CakeException;
use PhpCollective\Toml\Toml;

class TomlConfigEngine implements ConfigEngineInterface
{
    protected string $path;

    public function __construct(string $path = CONFIG)
    {
        $this->path = $path;
    }

    public function read(string $key): array
    {
        $file = $this->path . $key . '.toml';
        if (!is_file($file)) {
            throw new CakeException("Could not load configuration file: {$file}");
        }

        return Toml::decodeFile($file);
    }

    public function dump(string $key, array $data): bool
    {
        $file = $this->path . $key . '.toml';

        return file_put_contents($file, Toml::encode($data)) !== false;
    }
}

// Register in bootstrap.php
use App\Configure\Engine\TomlConfigEngine;
use Cake\Core\Configure;

Configure::config('toml', new TomlConfigEngine());
Configure::load('app_local', 'toml');

This is a proper CakePHP integration point because Configure is designed to work with custom engines.

Symfony – Custom parameter import pattern:

// src/DependencyInjection/TomlExtension.php
namespace App\DependencyInjection;

use PhpCollective\Toml\Toml;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\DependencyInjection\Extension\Extension;

class TomlExtension extends Extension
{
    public function load(array $configs, ContainerBuilder $container): void
    {
        $configFile = $container->getParameter('kernel.project_dir') . '/config/app.toml';
        if (file_exists($configFile)) {
            $tomlConfig = Toml::decodeFile($configFile);
            $container->setParameter('app.toml', $tomlConfig);
        }
    }
}

This can work in a bundle or application-specific extension, but it is still a custom pattern. Symfony’s standard configuration formats are YAML, XML, and PHP, so this is better framed as importing TOML-derived parameters than as native Symfony config loading.

Laravel – Boot-time import pattern:

// app/Providers/TomlConfigServiceProvider.php
namespace App\Providers;

use Illuminate\Support\ServiceProvider;
use PhpCollective\Toml\Toml;

class TomlConfigServiceProvider extends ServiceProvider
{
    public function boot(): void
    {
        $tomlConfig = config_path('app.toml');
        if (file_exists($tomlConfig)) {
            config()->set('toml', Toml::decodeFile($tomlConfig));
        }
    }
}

This is useful when you want TOML-backed application settings inside a Laravel app, but it is not native Laravel config-file support. Keeping the imported data under a dedicated toml key avoids accidentally overwriting existing framework or package config keys.

Migrating from YAML or NEON

Converting existing configuration files is straightforward. Here’s a helper approach:

use PhpCollective\Toml\Toml;
use Symfony\Component\Yaml\Yaml;
// Load existing YAML
$yamlData = Yaml::parseFile('config.yaml');
// Write as TOML
Toml::encodeFile('config.toml', $yamlData);

For NEON (Nette):

use PhpCollective\Toml\Toml;
use Nette\Neon\Neon;

$neonContent = file_get_contents('config.neon');
$neonData = Neon::decode($neonContent);
// Filter out NEON-specific constructs like entities
// that don't translate directly to TOML.
// Note: this only checks top-level values; nested values need a recursive walk.
$cleanData = array_filter($neonData, fn($v) => !is_object($v));
Toml::encodeFile('config.toml', $cleanData);

Some things to watch for during migration:

  • NEON entities (@service, %parameter%) have no TOML equivalent
  • YAML anchors and aliases need to be expanded
  • Deeply nested structures may become verbose in TOML
  • Datetime literals need to match TOML’s RFC 3339-based syntax
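For nested NEON data, the object filter has to recurse. A minimal sketch of that walk; stripObjects() is a hypothetical helper, and it drops every object it finds, so date objects you want to keep would need separate handling:

```php
<?php
// Sketch: recursively drop object values (for example NEON entities)
// so only scalars and arrays reach the TOML encoder.
function stripObjects(array $data): array
{
    $clean = [];
    foreach ($data as $key => $value) {
        if (is_object($value)) {
            continue; // no direct TOML equivalent, skip it
        }
        $clean[$key] = is_array($value) ? stripObjects($value) : $value;
    }

    // Re-index lists so removed elements do not leave gaps in the keys.
    return array_is_list($data) ? array_values($clean) : $clean;
}
```

Logging the skipped keys while migrating is probably worth it, so nothing disappears silently.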

Installation

composer require php-collective/toml

Requires PHP 8.2+. Optionally, install the php-ds extension for better performance with large files.

Summary

TOML won’t replace YAML or NEON everywhere, but it has its place — especially when interoperating with tools that already use it. For PHP projects that need TOML support, this library provides a complete implementation.

Personally, I still reach for NEON or classic PHP arrays in most projects — for deeply nested configs, they’re simply more concise. TOML shines in flatter structures and cross-language tooling. What are you using?



Data Governance in CakePHP with Bouncer and AuditStash 22 Mar 4:16 AM (last month)


How two plugins work together to bring enterprise-grade data integrity to your CakePHP application

The Problem: Who Changed What, and Should They Have?

Every application that handles important data eventually faces the same questions:

  • “Who deleted that customer record last week?”
  • “Can we undo the changes made by that intern?”
  • “We need an approval process before price changes go live”
  • “The auditor wants to see every modification to financial records”

These aren’t edge cases. They’re fundamental requirements for any serious business application. Yet many frameworks leave you to implement these patterns from scratch, leading to inconsistent solutions scattered across your codebase.

CakePHP developers now have a comprehensive answer: Bouncer and AuditStash — two plugins that, together, provide complete data governance for your application.

Meet the Plugins

AuditStash: The All-Seeing Eye

AuditStash automatically tracks every change to your data. Every create, update, and delete is logged with:

  • Who made the change (user ID and display name)
  • When it happened (precise timestamps)
  • What changed (field-level diffs)
  • Where it came from (CLI, web, or API)
  • Why it was grouped (transaction IDs for related changes)

// In your Table class
public function initialize(array $config): void
{
    parent::initialize($config);
    $this->addBehavior('AuditStash.AuditLog', [
        'blacklist' => ['password', 'token'],
    ]);
}

That’s it. Every change to this table is now permanently recorded.
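To illustrate what a field-level diff with a blacklist boils down to, here is a minimal sketch in plain PHP. It is not AuditStash's implementation; diffFields() and the shape of its result are assumptions for illustration only:

```php
<?php
// Sketch of a field-level diff; not AuditStash's actual implementation.
function diffFields(array $before, array $after, array $blacklist = []): array
{
    $changes = [];
    foreach ($after as $field => $value) {
        if (in_array($field, $blacklist, true)) {
            continue; // never record sensitive fields
        }
        $old = $before[$field] ?? null;
        if ($old !== $value) {
            $changes[$field] = ['old' => $old, 'new' => $value];
        }
    }

    return $changes;
}
```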

Bouncer: The Gatekeeper

Bouncer intercepts changes before they happen. Instead of immediately saving user submissions, it stores them as drafts pending approval:

// In your Table class
public function initialize(array $config): void
{
    parent::initialize($config);
    $this->addBehavior('Bouncer.Bouncer', [
        'actions' => ['add', 'edit', 'delete'],
    ]);
}

Now when a user saves a record:

  1. The change is captured as a draft
  2. An administrator or moderator reviews the proposed change
  3. Side-by-side diff shows exactly what will change
  4. Admin approves or rejects with optional notes
  5. Only approved changes modify the actual data

The Power of Combination

Here’s where it gets interesting. These plugins aren’t just useful individually — they’re designed to complement each other.

Scenario: Financial Application

Consider a financial application where:

  • Junior staff can propose changes to accounts
  • Senior staff must approve changes over $10,000
  • All changes must be auditable for compliance
  • Regulators may request complete change history

// AccountsTable.php
public function initialize(array $config): void
{
    parent::initialize($config);

    // Track ALL changes for compliance
    $this->addBehavior('AuditStash.AuditLog', [
        'blacklist' => ['internal_notes'],
    ]);

    // Require approval for modifications
    $this->addBehavior('Bouncer.Bouncer', [
        'actions' => ['edit', 'delete'],
        'bypassCallback' => function ($entity, $options) {
            // Auto-approve small changes by senior staff
            $user = $options['user'] ?? null;
            $isSenior = $user && $user->role === 'senior';
            $isSmallChange = abs($entity->balance - $entity->getOriginal('balance')) < 10000;

            return $isSenior && $isSmallChange;
        },
    ]);
}

The result:

Action | Junior Staff | Senior Staff (< $10k) | Senior Staff (>= $10k)
Propose change | Draft created | Auto-approved | Draft created
Audit logged | Yes (draft) | Yes (immediate) | Yes (draft + approval)
Requires review | Yes | No | Yes
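The "Requires review" row can be checked with a few lines of plain PHP. A minimal sketch, using the role name and threshold from the example; requiresReview() itself is hypothetical and only mirrors the bypass logic:

```php
<?php
// Sketch of the decision table; the function is hypothetical.
function requiresReview(string $role, float $oldBalance, float $newBalance): bool
{
    $isSenior = $role === 'senior';
    $isSmallChange = abs($newBalance - $oldBalance) < 10000;

    // Only small changes by senior staff bypass the approval queue.
    return !($isSenior && $isSmallChange);
}
```

Note the boundary: a change of exactly $10,000 is not "small" and still goes to review.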

Scenario: Content Management

A publishing platform where:

  • Writers submit articles
  • Editors review and approve
  • Published content can be reverted if needed
  • Legal needs history for liability

// ArticlesTable.php
public function initialize(array $config): void
{
    parent::initialize($config);
    $this->addBehavior('AuditStash.AuditLog');
    $this->addBehavior('Bouncer.Bouncer', [
        'actions' => ['add', 'edit'],
    ]);
}

The workflow:

Writer submits article
        |
        v
[Draft Created] --- AuditStash logs: "draft submitted"
        |
        v
Editor reviews diff
        |
       / \
      v   v
 Approve   Reject
    |         |
    v         v
Published   Writer notified
    |
    v
AuditStash logs: "approved and published"

If a published article causes problems, AuditStash provides the complete timeline:

// In your controller
$timeline = $this->AuditLogs->find('timeline', [
    'source' => 'articles',
    'primary_key' => $articleId,
]);
// Returns every change, who made it, when, and what changed

Real-World Features That Matter

GDPR Compliance (AuditStash)

When a user requests data deletion under GDPR:

# Export all audit data for a user
bin/cake audit_stash gdpr export --user 42 --output user-42-data.json
# Anonymize user's audit trail (keeps records, removes PII)
bin/cake audit_stash gdpr anonymize --user 42
# Or completely delete if required
bin/cake audit_stash gdpr delete --user 42

Conflict Resolution (Bouncer)

When two users edit the same record simultaneously, Bouncer’s 3-way merge shows exactly what conflicts:

Original: "Product costs $100"
User A: "Product costs $120" (pending)
User B: "Product costs $95" (pending)
Admin sees both proposals side-by-side and decides
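The underlying check is simple: a field conflicts when both drafts changed it relative to the original, and to different values. A minimal sketch of that idea, not Bouncer's actual merge code; conflictingFields() is a hypothetical name:

```php
<?php
// Sketch: find fields that two pending drafts both changed, to different values.
function conflictingFields(array $original, array $draftA, array $draftB): array
{
    $conflicts = [];
    foreach ($original as $field => $base) {
        // A draft that does not mention a field leaves it unchanged.
        $a = array_key_exists($field, $draftA) ? $draftA[$field] : $base;
        $b = array_key_exists($field, $draftB) ? $draftB[$field] : $base;
        if ($a !== $base && $b !== $base && $a !== $b) {
            $conflicts[$field] = ['original' => $base, 'a' => $a, 'b' => $b];
        }
    }

    return $conflicts;
}
```

Fields changed by only one draft can be merged automatically; only the returned ones need a human decision.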

Retention Policies (AuditStash)

Keep audit logs manageable with automatic cleanup:

// config/audit_stash.php
return [
    'AuditStash' => [
        'retention' => [
            'default' => '2 years',
            'financial_records' => false, // Never delete
            'session_logs' => '30 days',
        ],
    ],
];

# Clean up old logs (respects retention policies)
bin/cake audit_stash cleanup --force
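Retention strings like these translate naturally into a cutoff date. A rough sketch of that conversion, assuming the values are valid relative date formats; retentionCutoff() is hypothetical and not taken from the plugin:

```php
<?php
// Sketch: turn a retention setting such as '2 years' into a cutoff date.
// false means "never delete", mirroring the config above.
function retentionCutoff(string|false $retention, DateTimeImmutable $now): ?DateTimeImmutable
{
    if ($retention === false) {
        return null; // keep forever
    }

    // modify() understands relative formats like '-2 years' or '-30 days'.
    return $now->modify('-' . $retention);
}
```

Everything older than the returned date is eligible for cleanup; a null cutoff means the table is skipped.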

Smart Change Detection (AuditStash)

Don’t clutter your audit log with noise:

$this->addBehavior('AuditStash.AuditLog', [
    'ignoreTimestampOnly' => true, // Skip if only the modified timestamp changed
    'ignoreWhitespace' => true, // Skip whitespace-only edits
    'ignoreFields' => ['view_count', 'last_accessed'],
]);
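Conceptually, these options answer one question: did anything meaningful change? A minimal sketch of such a check in plain PHP; isNoise() is hypothetical and not AuditStash code:

```php
<?php
// Sketch of the filtering idea; not AuditStash code.
function isNoise(array $before, array $after, array $ignoreFields = ['modified']): bool
{
    foreach ($after as $field => $value) {
        if (in_array($field, $ignoreFields, true)) {
            continue;
        }
        $old = $before[$field] ?? null;
        // Collapse whitespace so pure whitespace edits do not count.
        if (is_string($old) && is_string($value)) {
            $old = preg_replace('/\s+/', ' ', trim($old));
            $value = preg_replace('/\s+/', ' ', trim($value));
        }
        if ($old !== $value) {
            return false; // a meaningful change
        }
    }

    return true;
}
```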

The Admin Interfaces

Both plugins ship with complete admin interfaces:

AuditStash Viewer (/admin/audit-logs)

  • Filter by table, user, date range, event type
  • Full-text search across changed values
  • Timeline view for any record’s complete history
  • Inline and side-by-side diff rendering
  • Export to CSV or JSON

Bouncer Review (/admin/bouncer)

  • Queue of pending drafts
  • Side-by-side comparison with current data
  • Approve/reject with notes
  • Bulk operations for efficient review
  • Display names for human-readable field labels

Both now include self-contained Bootstrap 5 layouts, meaning they work out of the box without requiring your application to use Bootstrap.

Getting Started

Installation is straightforward:

composer require dereuromark/cakephp-audit-stash
composer require dereuromark/cakephp-bouncer

Load the plugins:

// src/Application.php
public function bootstrap(): void
{
    parent::bootstrap();
    $this->addPlugin('AuditStash');
    $this->addPlugin('Bouncer');
}

Run migrations:

bin/cake migrations migrate --plugin AuditStash
bin/cake migrations migrate --plugin Bouncer

Add behaviors to your tables, and you’re protected.

Conclusion

Data governance isn’t optional anymore. Regulations like GDPR, SOX, and HIPAA demand accountability. Users expect undo functionality. Businesses need approval workflows.

AuditStash and Bouncer bring these enterprise patterns to CakePHP with minimal configuration:

  • AuditStash answers “what happened?” with complete change tracking
  • Bouncer answers “should this happen?” with approval workflows
  • Together, they provide complete data governance for any CakePHP application

Both plugins have reached their 1.0.0 stable releases, ready for production use.



DTOs at the Speed of Plain PHP 2 Mar 9:43 AM (2 months ago)


Zero Reflection, Zero Regrets

Every PHP developer knows the pain. You’re deep in a template, staring at $data['user']['address']['city'], wondering if that key actually exists or if you’re about to trigger a notice that’ll haunt your logs forever.

DTOs solve this. But the cure has often been worse than the disease.

This post aims to:

  1. raise awareness of the performance cost of moving from arrays to ArrayObject to DTOs
  2. provide a high-speed alternative to reflection libraries with the same feature set (or more)

The Reflection Tax

Modern PHP DTO libraries are clever. Too clever. They use runtime reflection to magically hydrate objects from arrays, infer types from docblocks, and validate on the fly. It’s beautiful—until you profile it.

Every. Single. Instantiation. Pays the reflection tax.

For a simple API endpoint returning 100 users? That’s 100 reflection calls. For a batch job processing 10,000 records? You’re burning CPU cycles on introspection instead of actual work.

And then there’s the IDE problem. Magic means your IDE is guessing. “Find Usages” becomes “Find Some Usages, Maybe.” PHPStan needs plugins. Autocomplete works… sometimes.

What If We Just… Generated the Code?

Here’s a radical idea: what if we did all that reflection once, at build time, and generated plain PHP classes?

Introducing php-collective/dto: The Code-Generation Approach

Data Transfer Objects (DTOs) have become essential in modern PHP applications. They provide type safety, IDE autocomplete, and make your code more maintainable. But the PHP ecosystem has long debated how to implement them: runtime reflection or manual boilerplate?

php-collective/dto takes a third path: code generation. Define your DTOs once in configuration, generate optimized PHP classes, and enjoy the best of both worlds.

Why Another DTO Library?

The PHP DTO landscape in 2026 looks like this:

  • Native PHP 8.2+ readonly classes: Manual implementation
  • spatie/laravel-data: Laravel-specific, runtime reflection
  • cuyz/valinor: Framework-agnostic runtime mapper
  • symfony/serializer: Component-based serialization

The library-based options are excellent tools, but they share a common limitation: runtime reflection overhead. Every time you create a DTO, the library inspects class metadata, parses types, and builds the object dynamically.

What if we did all that work once, at build time?

Basic concept

The idea is not that radical after all. Similar implementations have existed for more than 15 years, way before modern PHP and the new syntax and features it brought along. I have been using it for a bit more than 11 years now myself.

You pick a config format: XML, YAML, NEON or PHP. PHP with builders is the most powerful option, as it gives you full autocomplete and type hinting:

return Schema::create()
    ->dto(Dto::create('User')->fields(
        Field::int('id')->required(),
        Field::string('email')->required(),
        Field::dto('address', 'Address'),
    ))
    ->toArray();

Run the generator:

vendor/bin/dto generate

Get a real PHP class:

class UserDto extends AbstractDto
{
    public function getId(): int { /* ... */ }
    public function getEmail(): string { /* ... */ }
    public function getAddress(): ?AddressDto { /* ... */ }
    public function setEmail(string $email): static { /* ... */ }
    // ...
}

No magic. No reflection. Just PHP.
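To make the build-time idea concrete, here is a toy generator that turns a field map into PHP source. It is only a sketch of the technique; the real generator's templates and output look different, and generateDtoSource() and PointDto are made-up names:

```php
<?php
// Toy generator: turns a field map into PHP source for a class with
// typed getters and fluent setters.
function generateDtoSource(string $class, array $fields): string
{
    $props = '';
    $methods = '';
    foreach ($fields as $name => $type) {
        $suffix = ucfirst($name);
        $props .= "    private ?{$type} \${$name} = null;\n";
        $methods .= "    public function get{$suffix}(): ?{$type} { return \$this->{$name}; }\n";
        $methods .= "    public function set{$suffix}({$type} \${$name}): static { \$this->{$name} = \${$name}; return \$this; }\n";
    }

    return "class {$class}\n{\n{$props}{$methods}}\n";
}

// At build time this string would be written to a file; eval() keeps the demo short.
eval(generateDtoSource('PointDto', ['x' => 'int', 'label' => 'string']));
$point = (new PointDto())->setX(3)->setLabel('origin');
```

All the "type knowledge" is spent once while building the string. At runtime, PointDto is just an ordinary class.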

What You Get

  1. Perfect IDE Support – Real methods = perfect autocomplete, “Find Usages”, refactoring
  2. Excellent Static Analysis – PHPStan/Psalm work without plugins or special annotations
  3. Reviewable Code – Generated classes appear in pull requests
  4. Zero Runtime Overhead – No reflection, no type parsing per instantiation
  5. Framework Agnostic – Works anywhere PHP runs

History

The concept was first used almost two decades ago in e-commerce systems that had a large number of modular packages and basically disallowed all manual array usage. Everything had to be a DTO for maximum extensibility and discoverability. The project could add fields per DTO as needed. The XML files of each module, as well as project extensions, were all merged together. XML makes this easy, and the generated DTOs are fully compatible at both core and project level.

I never needed the “merging” feature, but I did like how quickly you could generate them, and that it could always generate full DTOs with all syntactic sugar as per current “language standards”.

Personally I always liked the XML style, because with XSD modern IDEs have full autocomplete and validation on them. But in some cases PHP might be more flexible and powerful.

Features That Matter

1. Multiple Configuration Formats

Choose what works for your team:

XML (with XSD validation):

<dto name="User">
    <field name="id" type="int" required="true"/>
    <field name="email" type="string" required="true"/>
    <field name="roles" type="string[]" collection="true"/>
</dto>

Or use YAML or NEON for minimal syntax. Or stick to the PHP one above.

2. Mutable and Immutable Options

Mutable (default) – traditional setters:

$user = new UserDto();
$user->setName('John');
$user->setEmail('john@example.com');

Immutable – returns new instances:

$user = new UserDto(['name' => 'John']);
$updated = $user->withEmail('john@example.com');
// $user is unchanged, $updated has new email
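Under the hood, a "with" method is just clone-and-modify. A minimal hand-written sketch of the pattern; ImmutableUser is a made-up class, not generated output:

```php
<?php
// Sketch of the "with" pattern on a hand-written class.
final class ImmutableUser
{
    public function __construct(
        private string $name,
        private ?string $email = null,
    ) {
    }

    public function withEmail(string $email): static
    {
        $copy = clone $this; // the original stays untouched
        $copy->email = $email;

        return $copy;
    }

    public function getEmail(): ?string
    {
        return $this->email;
    }
}
```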

Configure per-DTO:

Dto::immutable('Event')->fields(/* ... */);

3. Smart Key Format Conversion

APIs use snake_case. JavaScript wants camelCase. Forms send dashed-keys. Handle all of them:

// From snake_case database
$dto->fromArray($dbRow, false, UserDto::TYPE_UNDERSCORED);
// To camelCase for JavaScript
return $dto->toArray(); // default camelCase
// To snake_case for Python API
return $dto->toArray(UserDto::TYPE_UNDERSCORED);
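The conversion itself is plain string inflection on the keys. A rough sketch for flat arrays, independent of the library; both function names are hypothetical:

```php
<?php
// Sketch of key inflection between snake_case and camelCase for flat arrays.
function underscoredToCamel(array $row): array
{
    $out = [];
    foreach ($row as $key => $value) {
        // first_name -> First_Name -> FirstName -> firstName
        $camel = lcfirst(str_replace('_', '', ucwords((string)$key, '_')));
        $out[$camel] = $value;
    }

    return $out;
}

function camelToUnderscored(array $data): array
{
    $out = [];
    foreach ($data as $key => $value) {
        // Insert an underscore before every capital except a leading one.
        $snake = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', (string)$key));
        $out[$snake] = $value;
    }

    return $out;
}
```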

4. Collections with Type Safety

<dto name="Order">
    <field name="items" type="OrderItem[]" collection="true" singular="item"/>
</dto>

Generated methods:

$order->getItems(); // ArrayObject<OrderItemDto>
$order->addItem($itemDto); // Type-checked
$order->hasItems(); // Collection not empty

Associative collections work too:

$config->addSetting('theme', $settingDto);
$theme = $config->getSetting('theme');

Custom collection factories let you use Laravel Collections, Doctrine ArrayCollection, or CakePHP Collection (when generated with a non-\ArrayObject collection type):

Dto::setCollectionFactory(fn($items) => collect($items));
// Now all getters return Laravel collections
$order->getItems()->filter(...)->sum(...);

5. Deep Nesting and Safe Access

$company = new CompanyDto($data);
// Safe nested reading with default
$city = $company->read(['departments', 0, 'address', 'city'], 'Unknown');
// Deep cloning - nested objects are fully cloned
$clone = $company->clone();
$clone->getDepartments()[0]->setName('Changed');
// Original unchanged
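The same path-based reading can be sketched over plain nested arrays, which shows why a missing segment never triggers a warning. readPath() is a hypothetical stand-in, not the DTO's read() method:

```php
<?php
// Sketch of path-based reading over nested arrays.
function readPath(array $data, array $path, mixed $default = null): mixed
{
    $current = $data;
    foreach ($path as $segment) {
        // Stop at the first missing segment instead of raising a warning.
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return $default;
        }
        $current = $current[$segment];
    }

    return $current;
}
```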

6. TypeScript Generation

Share types with your frontend:

vendor/bin/dto typescript --output=frontend/src/types/

Generates:

export interface UserDto {
    id: number;
    email: string;
    name?: string;
    roles: string[];
}

export interface OrderDto {
    id: number;
    customer: UserDto;
    items: OrderItemDto[];
}

Options include multi-file output, readonly interfaces, and strict null handling.

7. Field Tracking for Partial Updates

Know exactly what was changed:

$dto = new UserDto();
$dto->setEmail('new@example.com');
$changes = $dto->touchedToArray();
// ['email' => 'new@example.com']
// Perfect for partial database updates
$repository->update($userId, $changes);
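The tracking idea is small enough to sketch by hand: every setter records which field it touched. TrackedUser is a made-up class for illustration, not what the generator emits:

```php
<?php
// Sketch of touched-field tracking on a hand-written class.
final class TrackedUser
{
    /** @var array<string, mixed> current field values */
    private array $values = ['name' => null, 'email' => null];

    /** @var array<string, true> fields that were set explicitly */
    private array $touched = [];

    public function setName(string $name): static
    {
        return $this->set('name', $name);
    }

    public function setEmail(string $email): static
    {
        return $this->set('email', $email);
    }

    public function touchedToArray(): array
    {
        // Keep only the values whose keys were marked as touched.
        return array_intersect_key($this->values, $this->touched);
    }

    private function set(string $field, mixed $value): static
    {
        $this->values[$field] = $value;
        $this->touched[$field] = true;

        return $this;
    }
}
```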

8. OrFail Methods for Non-Null Guarantees

Every nullable field gets an OrFail variant:

$email = $dto->getEmail(); // string|null
$email = $dto->getEmailOrFail(); // string (throws if null)

Use after validation to avoid null checks:

$email = $dto->getEmailOrFail(); // PHPStan now knows it is not nullable
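For comparison, this is roughly what such a pair of methods looks like when written by hand. A minimal sketch; the class name and the exception type are my assumptions, not the generated code:

```php
<?php
// Sketch of a nullable getter next to its OrFail counterpart.
final class Contact
{
    public function __construct(private ?string $email = null)
    {
    }

    public function getEmail(): ?string
    {
        return $this->email;
    }

    public function getEmailOrFail(): string
    {
        if ($this->email === null) {
            // Exception type is an assumption for this sketch.
            throw new RuntimeException('Email is not set.');
        }

        return $this->email;
    }
}
```

The non-nullable return type is what lets static analysis drop the null branch for callers.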

9. Required Fields

Enforce data integrity at creation:

<field name="id" type="int" required="true"/>

new UserDto(['name' => 'John']);
// InvalidArgumentException: Required fields missing: id

10. Validation Rules

Beyond required fields, you can add common validation constraints:

Dto::create('User')->fields(
    Field::string('name')->required()->minLength(2)->maxLength(100),
    Field::string('email')->required()->pattern('/^[^@]+@[^@]+\.[^@]+$/'),
    Field::int('age')->min(0)->max(150),
)

Rule | Applies To | Description
minLength | string | Minimum string length
maxLength | string | Maximum string length
min | int, float | Minimum numeric value
max | int, float | Maximum numeric value
pattern | string | Regex pattern validation

Validation runs on instantiation. Null fields skip validation — rules only apply when a value is present.

The validationRules() method extracts all rules as metadata, useful for bridging to framework validators:

$rules = $dto->validationRules();
// ['name' => ['required' => true, 'minLength' => 2, 'maxLength' => 100], ...]
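To show how such metadata could be bridged to your own checks, here is a small sketch that evaluates an array shaped like the output above. violations() is hypothetical, and the error labels are made up for illustration:

```php
<?php
// Sketch: evaluate data against rule metadata shaped like validationRules() output.
function violations(array $rules, array $data): array
{
    $errors = [];
    foreach ($rules as $field => $r) {
        $value = $data[$field] ?? null;
        if ($value === null) {
            if (!empty($r['required'])) {
                $errors[$field][] = 'required';
            }
            continue; // other rules only apply when a value is present
        }
        if (is_string($value)) {
            // strlen() counts bytes; use mb_strlen() for multibyte text.
            if (isset($r['minLength']) && strlen($value) < $r['minLength']) {
                $errors[$field][] = 'minLength';
            }
            if (isset($r['maxLength']) && strlen($value) > $r['maxLength']) {
                $errors[$field][] = 'maxLength';
            }
            if (isset($r['pattern']) && preg_match($r['pattern'], $value) !== 1) {
                $errors[$field][] = 'pattern';
            }
        }
        if (is_int($value) || is_float($value)) {
            if (isset($r['min']) && $value < $r['min']) {
                $errors[$field][] = 'min';
            }
            if (isset($r['max']) && $value > $r['max']) {
                $errors[$field][] = 'max';
            }
        }
    }

    return $errors;
}
```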

11. Enum Support

<field name="status" type="\App\Enum\OrderStatus"/>

// From enum instance
$order->setStatus(OrderStatus::Pending);
// From backing value - auto-converted
$order = new OrderDto(['status' => 'confirmed']);
$order->getStatus(); // OrderStatus::Confirmed
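The conversion relies on PHP's backed enums. A minimal sketch of the idea; the enum below and its backing values are assumptions for illustration, and toOrderStatus() is a hypothetical helper:

```php
<?php
// Assumed enum for this sketch; the backing values are illustrative.
enum OrderStatus: string
{
    case Pending = 'pending';
    case Confirmed = 'confirmed';
}

// Accept either an enum instance or its backing value.
function toOrderStatus(OrderStatus|string $value): OrderStatus
{
    // from() throws a ValueError for unknown backing values.
    return $value instanceof OrderStatus ? $value : OrderStatus::from($value);
}
```

If unknown values should be tolerated instead, OrderStatus::tryFrom() returns null rather than throwing.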

12. Value Objects and DateTime

<field name="price" type="\Money\Money"/>
<field name="createdAt" type="\DateTimeImmutable"/>

Custom factories for complex instantiation:

Field::class('date', \DateTimeImmutable::class)->factory('createFromFormat')

13. Transform Functions

Apply callables to transform values during hydration or serialization:

Field::string('email')
->transformFrom('App\\Transform\\Email::normalize') // Before hydration
->transformTo('App\\Transform\\Email::mask') // After serialization

Useful for normalizing input (trimming, lowercasing) or masking output (hiding sensitive data). For collections, transforms apply to each element.
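The transform targets are ordinary static methods. A sketch of what a normalize/mask pair could look like; EmailTransform is a hypothetical class (kept in the global namespace for brevity), not something shipped with the library:

```php
<?php
// Hypothetical transform class for illustration.
final class EmailTransform
{
    public static function normalize(string $email): string
    {
        return strtolower(trim($email));
    }

    public static function mask(string $email): string
    {
        $at = strrpos($email, '@');
        if ($at === false || $at === 0) {
            return '***'; // not an address we can partially show
        }

        // Keep the first character and the domain, hide the rest.
        return $email[0] . '***' . substr($email, $at);
    }
}

// 'Class::method' strings are valid PHP callables.
$normalized = call_user_func('EmailTransform::normalize', '  John@Example.COM ');
```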

14. DTO Inheritance

Share common fields:

Dto::create('BaseEntity')->fields(
    Field::int('id')->required(),
    Field::class('createdAt', \DateTimeImmutable::class),
)

Dto::create('User')->extends('BaseEntity')->fields(
    Field::string('email')->required(),
)
// UserDto has id, createdAt, and email

15. Array Shapes

Every generated DTO now gets shaped array types on toArray() and createFromArray():

// UserDto with fields: id (int, required), name (string), email (string, required)
/**
 * @return array{id: int, name: string|null, email: string}
 */
public function toArray(?string $type = null, ?array $fields = null, bool $touched = false): array

  1. IDE Autocomplete – typing $dto->toArray()['na suggests name
  2. Typo Detection – $dto->toArray()['naem'] shows an error
  3. Type Inference – ['name' => $name] = $dto->toArray() infers $name as string|null
  4. Destructuring Support – Full type safety when unpacking arrays

16. JSON Schema Generation

Complement your TypeScript types with JSON Schema for API documentation and contract testing:

vendor/bin/dto jsonschema --output=schemas/

Supports --single-file (with $defs references), --multi-file, --no-refs (inline nested objects), and --date-format options.

Also:

  • Property name mapping via mapFrom() and mapTo() — read from email_address in input, write to emailAddr in output
  • Default values for fields
  • Deprecation annotations (IDE warnings for deprecated fields)
  • Union types support (string|int)
  • Generic collection type hints (@return ArrayObject<int, ItemDto>)
  • Computed/derived fields via traits (getFullName() from firstName + lastName)
  • Schema importer (bootstrap DTOs from JSON schemas or OpenAPI 3.x specifications)
  • JSON serialization via serialize()/unserialize()
  • Doctrine mapper generation (--mapper) for SELECT NEW style constructors
  • Collection adapters (CakePHP, Laravel, Doctrine) via adapter registry

Real-World Patterns

API Response Transformation

class UserController
{
    public function show(int $id): JsonResponse
    {
        $user = $this->repository->find($id);
        $dto = UserDto::createFromArray($user->toArray());

        // Snake case for JSON API
        return new JsonResponse($dto->toArray(UserDto::TYPE_UNDERSCORED));
    }
}

Form Handling with Partial Updates

public function update(Request $request, int $id): Response
{
    $dto = new UserDto();
    $dto->fromArray($request->all(), false, UserDto::TYPE_UNDERSCORED);

    // Only update fields that were actually submitted
    $this->repository->update($id, $dto->touchedToArray());

    return new Response('Updated');
}

Event Sourcing with Immutable DTOs

$event = new OrderPlacedDto([
    'eventId' => Uuid::uuid4()->toString(),
    'aggregateId' => $orderId,
    'occurredAt' => new DateTimeImmutable(),
    'order' => $orderDto,
]);
// Create corrected version without mutating original
$corrected = $event->withVersion(2);

Performance: The Numbers

We ran comprehensive benchmarks comparing php-collective/dto against plain PHP, spatie/laravel-data, and cuyz/valinor. Test environment: PHP 8.4.17, 10,000 iterations per test.

Versions used: php-collective/dto dev-master (e4e1f9c), spatie/laravel-data 4.19.1, cuyz/valinor 2.3.2. A standalone comparison also includes spatie/data-transfer-object 3.9.1 and symfony/serializer 8.0.5.
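If you want to reproduce this kind of measurement on your own DTOs, a basic timing loop is enough to get comparable orders of magnitude. This is a rough sketch and not the harness used for the published numbers; benchmark() is a hypothetical helper:

```php
<?php
// Sketch of a micro-benchmark loop based on the monotonic hrtime() clock.
function benchmark(callable $fn, int $iterations = 10000): array
{
    $fn(); // warm-up call so autoloading and first-call costs are excluded

    $start = hrtime(true); // nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    $elapsedNs = hrtime(true) - $start;

    $avgUs = $elapsedNs / $iterations / 1000;

    return [
        'avg_us' => $avgUs,
        'ops_per_sec' => $avgUs > 0 ? 1_000_000 / $avgUs : INF,
    ];
}
```

Average time and operations per second are two views of the same number: 0.60 µs per call corresponds to roughly 1.7 million calls per second.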

Simple DTO Creation (User with 6 fields)

Library | Avg Time | Operations/sec | Relative
Plain PHP readonly DTO | 0.27 µs | 3.64M/s | 2.2x faster
php-collective/dto createFromArray() | 0.60 µs | 1.68M/s | baseline
spatie/laravel-data from() | 14.77 µs | 67.7K/s | 25x slower
cuyz/valinor | 15.78 µs | 63.4K/s | 26x slower

Standalone benchmarks (using spatie/data-transfer-object instead of laravel-data, which requires a full Laravel app) show spatie/data-transfer-object at 52.8K/s and symfony/serializer at 106K/s.

Complex Nested DTOs (Order with User, Address, 3 Items)

Library | Avg Time | Operations/sec | Relative
Plain PHP nested DTOs | 1.75 µs | 571K/s | 1.8x faster
php-collective/dto | 3.10 µs | 322K/s | baseline
spatie/laravel-data | 48.83 µs | 20.5K/s | 16x slower
cuyz/valinor | 68.67 µs | 14.6K/s | 22x slower

Standalone nested results: spatie/data-transfer-object 10.6K/s, symfony/serializer 13.6K/s.

The gap widens with complexity. Runtime libraries pay reflection costs for every nested object. Generated code doesn’t.

Serialization (toArray)

Library | Avg Time | Operations/sec | Relative
Plain PHP toArray() | 0.68 µs | 1.48M/s | 1.8x faster
php-collective/dto | 1.20 µs | 832K/s | baseline
spatie/laravel-data | 26.95 µs | 37.1K/s | 22x slower

Property Access (10 reads)

Approach | Avg Time | Operations/sec
Plain PHP property access | 0.11 µs | 9.48M/s
php-collective/dto getters | 0.20 µs | 4.91M/s
Plain array access | 0.15 µs | 6.77M/s

Getter methods take roughly twice as long as direct property access (0.20 µs vs 0.11 µs for 10 reads), but in absolute terms the overhead is negligible in real applications.

Mutable vs Immutable Operations

Operation | Avg Time | Operations/sec
Mutable: setName() | 0.08 µs | 13.1M/s
Immutable: withName() | 0.12 µs | 8.34M/s

Immutable operations are ~1.6x slower due to object cloning, but still extremely fast at 8.3 million operations per second.

JSON Serialization

Approach | Avg Time | Operations/sec
Plain array -> JSON | 1.13 µs | 888K/s
Plain PHP DTO -> JSON | 2.07 µs | 484K/s
php-collective/dto -> JSON | 2.95 µs | 339K/s

At 339K JSON documents per second, this is more than sufficient for any web application. A typical API handles 1K-10K requests/second.

Visual Comparison

Simple DTO Creation (ops/sec, higher is better):
┌──────────────────────────────────────────────────────────────────┐
│ Plain PHP ████████████████████████████████████ 3.64M/s │
│ php-collective ██████████████████ 1.68M/s │
│ laravel-data █ 67.7K/s │
│ valinor █ 63.4K/s │
└──────────────────────────────────────────────────────────────────┘
Complex Nested DTO (ops/sec, higher is better):
┌──────────────────────────────────────────────────────────────────┐
│ Plain PHP ██████████████████████████████████ 571K/s │
│ php-collective ███████████████████ 322K/s │
│ laravel-data ████ 20.5K/s │
│ valinor ███ 14.6K/s │
└──────────────────────────────────────────────────────────────────┘

Key Insights

  1. php-collective/dto is 25-26x faster than runtime DTO libraries for object creation
  2. Only ~2.2x slower than plain PHP — generated code approaches hand-written performance
  3. Serialization is ~22x faster than spatie/laravel-data — generated toArrayFast() avoids per-field metadata lookups
  4. The performance gap grows with nesting – more nested objects = more reflection overhead for runtime libraries
  5. Can process ~322K complex nested DTOs per second – sufficient for any batch processing scenario
  6. Property access and mutability operations are near-native speed

When to Use php-collective/dto

Choose php-collective/dto when:

  • Performance matters (API responses, batch processing)
  • You want excellent IDE and static analysis support
  • You prefer configuration files over code attributes
  • You need both mutable and immutable DTOs
  • You work with different key formats
  • You want to share types with TypeScript frontends
  • You value reviewable, inspectable generated code

Consider alternatives when:

  • You’re already deep in Laravel and want framework integration (laravel-data)
  • You need advanced validation like conditional rules or cross-field dependencies
  • You want runtime-only, no build step (valinor)

Summary

php-collective/dto brings the best of code generation to PHP DTOs:

Aspect | php-collective/dto | Runtime Libraries
Performance | 25-26x faster | Baseline
IDE Support | Excellent | Good
Static Analysis | Native | Requires plugins
Code Review | Visible generated code | Magic/runtime
Build Step | Required | None

The library is framework-agnostic, well-documented, and actively maintained.

For many apps, the performance overhead of reflection might not be relevant. After all, you might only have a few DTOs per template for simpler actions. But if you are handling a huge number of DTOs, a less magical approach could be a viable option. At the very least, it will pay off more than trying to nano-optimize other parts of the application.

Migration Path: From Arrays to DTOs

Adopting DTOs doesn’t have to be a big-bang rewrite. Here’s a practical, incremental path from raw arrays to fully typed DTOs — each step delivers value on its own.

Stage 0: The Array Wilderness

This is where most legacy PHP projects start. Data flows as associative arrays, and every access is a leap of faith:

// Controller
public function view(int $id): Response
{
    $user = $this->Users->get($id, contain: ['Addresses', 'Roles']);
    $data = $user->toArray();

    // Pass array to service
    $summary = $this->buildSummary($data);

    return $this->response->withJson($summary);
}

private function buildSummary(array $data): array
{
    return [
        'full_name' => $data['first_name'] . ' ' . $data['last_name'],
        'city' => $data['address']['city'] ?? 'Unknown', // exists?
        'role_count' => count($data['roles'] ?? []), // array?
    ];
}

Problems: no autocomplete, no type safety, no way to know the shape without reading the query. A typo like $data['adress'] ?? 'Unknown' silently falls back to the default, and without the ?? it only surfaces as a warning at runtime.

Stage 1: Introduce DTOs at the Boundary

Start where it hurts most — the API response layer. Replace outgoing arrays with DTOs:

// Define the DTO config
Dto::create('UserSummary')->fields(
    Field::string('fullName')->required(),
    Field::string('city'),
    Field::int('roleCount'),
);

vendor/bin/dto generate

// Controller: the summary is now built as a DTO instead of an ad-hoc array
public function view(int $id): Response
{
    $user = $this->Users->get($id, contain: ['Addresses', 'Roles']);
    $summary = new UserSummaryDto([
        'fullName' => $user->first_name . ' ' . $user->last_name,
        'city' => $user->address?->city,
        'roleCount' => count($user->roles),
    ]);

    return $this->response->withJson($summary->toArray());
}

The entity query stays the same. The service layer stays the same. But the API contract is now explicit, typed, and autocomplete-friendly. If someone removes city from the DTO config, the generator catches it.

Stage 2: Move DTOs Inward to Service Methods

Once boundaries are typed, push DTOs into service signatures:

// Before: what does this array contain? Who knows.
public function calculateShipping(array $order): float
// After: explicit contract
public function calculateShipping(OrderDto $order): float
{
$weight = $order->getItems()
->filter(fn(OrderItemDto $item) => $item->getWeight() > 0)
->sum(fn(OrderItemDto $item) => $item->getWeight());
return $this->rateCalculator->forWeight($weight, $order->getAddress());
}

Every caller now gets a static check (via PHPStan), before the code ever runs, that they’re passing the right data. The method signature is the documentation.
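
The effect of a typed signature can be tried out without any library. The following is a minimal sketch, assuming two hypothetical hand-written classes (Order, OrderItem) in place of the generated OrderDto and OrderItemDto, and plain arrays in place of the collection API used above:

```php
<?php
declare(strict_types=1);

// Hypothetical minimal DTOs; generated ones would offer more (collections, toArray(), ...).
final class OrderItem
{
    public function __construct(public readonly string $sku, public readonly float $weight)
    {
    }
}

final class Order
{
    /** @param list<OrderItem> $items */
    public function __construct(public readonly array $items)
    {
    }
}

function totalShippingWeight(Order $order): float
{
    $weights = array_map(static fn (OrderItem $item): float => $item->weight, $order->items);

    // Only items with a positive weight count towards shipping.
    return array_sum(array_filter($weights, static fn (float $w): bool => $w > 0));
}

$order = new Order([
    new OrderItem('book', 0.5),
    new OrderItem('voucher', 0.0),
    new OrderItem('lamp', 2.0),
]);
$weight = totalShippingWeight($order);

// A plain array is rejected at the call boundary with a TypeError.
$rejected = false;
try {
    totalShippingWeight(['items' => []]);
} catch (\TypeError $e) {
    $rejected = true;
}
```

The wrong call is caught at runtime here, but the point of the typed signature is that PHPStan reports it without running anything.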

Stage 3: Replace Internal Array Passing

Target the most common pattern — methods that return arrays of mixed data:

// Before
public function getStats(): array
{
    return [
        'total_users' => $this->Users->find()->count(),
        'active_today' => $this->Users->find('activeToday')->count(),
        'revenue' => $this->Orders->find()->sumOf('total'),
    ];
}
// Template: $stats['total_users'] (typo-prone, no autocomplete)

// After
public function getStats(): DashboardStatsDto
{
    return new DashboardStatsDto([
        'totalUsers' => $this->Users->find()->count(),
        'activeToday' => $this->Users->find('activeToday')->count(),
        'revenue' => $this->Orders->find()->sumOf('total'),
    ]);
}
// Template: $stats->getTotalUsers() (autocomplete, type-checked)

Stage 4: Use Projection for Read-Only Queries

For CakePHP 5.3+, skip the entity entirely on read paths:

// Before: full entity hydration, then manual mapping
$users = $this->Users->find()
->select(['id', 'email', 'name', 'created'])
->contain(['Roles'])
->all()
->toArray();
// After: straight to DTO, no entity in between
$users = $this->Users->find()
->select(['id', 'email', 'name', 'created'])
->contain(['Roles'])
->projectAs(UserListDto::class)
->all()
->toArray();

The query result maps directly into UserListDto objects. No entity overhead, no intermediate array step.
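
Conceptually, projection means that each result row is handed straight to a DTO constructor. This is a plain-PHP sketch of that idea and not the CakePHP implementation: UserListItem is a hypothetical class, and the $rows array stands in for raw database rows.

```php
<?php
declare(strict_types=1);

// Hypothetical read-only DTO for a list view (PHP 8.1+ for readonly properties).
final class UserListItem
{
    public function __construct(
        public readonly int $id,
        public readonly string $email,
        public readonly string $name,
    ) {
    }
}

// Stand-in for raw database rows (column name => value).
$rows = [
    ['id' => 1, 'email' => 'ada@example.org', 'name' => 'Ada'],
    ['id' => 2, 'email' => 'alan@example.org', 'name' => 'Alan'],
];

// String keys are spread as named arguments, so each row maps straight onto the constructor.
$users = array_map(
    static fn (array $row): UserListItem => new UserListItem(...$row),
    $rows,
);
```

A column that does not match a constructor parameter raises an Error right away, which is the kind of early feedback the array version never gives.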

What to Migrate First

Not everything needs a DTO. Prioritize based on pain:

| Priority | Where | Why |
|----------|-------|-----|
| High | API responses | External contract, most likely to break silently |
| High | Service method params | Most frequent source of “what keys does this array have?” |
| Medium | Template variables | Autocomplete in templates reduces bugs |
| Medium | Queue/event payloads | Serialization boundaries need explicit shapes |
| Low | Internal helper returns | If only one caller exists, the overhead isn’t worth it |
| Skip | Simple key-value configs | Arrays are fine for ['timeout' => 30] |

Rules of Thumb

  • Don’t convert everything at once. Start with the file you’re already editing.
  • One DTO per PR. Each conversion is a small, reviewable change.
  • Let the pain guide you. If you’ve been burned by a missing array key, that’s where the DTO goes.
  • Keep entities for writes. Entities handle validation, callbacks, and persistence. DTOs handle data transfer. They coexist.
  • Generated DTOs can wrap entities. Use UserDto::createFromArray($entity->toArray()) as a bridge during migration — no need to refactor the query layer first.

Demo

A live demo is available in the sandbox. Especially check out the “projection” examples that map the DB content 1:1 into self-describing DTOs. If needed, the required DTOs can be (re-)generated from the DB structure with a single click in the backend.

Generated code is boring. Predictable. Fast.
Sometimes boring is exactly what you need.

php-collective/dto is available on Packagist. MIT licensed. PRs welcome.


Displaying maps in (Cake)PHP 4 Feb 9:22 AM (3 months ago)

Breaking Free from Google Maps: Modern Open-Source Alternatives for CakePHP.

Almost 20 years ago I started working with Google Maps. Geocoding and other tooling followed over time, and it was all bundled into a useful and popular Geo plugin.

Nowadays there are many great alternatives, a lot of them open source or with free tiers. I recently got a surprise invoice from Google: a small pet project was suddenly costing 50-80 EUR per month for static map rendering.

With open-source or free-tier providers, a pet project does not run that risk: once the quota is used up, the map is simply not displayed, which seems the safer failure mode to start with. They allow around 5,000 displays per day, which should be enough to get by, and bots or other accidental over-use cannot turn into a bill.

The Google Problem

Google Maps has been the go-to solution for web mapping for years, but the pricing model has become increasingly aggressive:

  • Pay-as-you-go pricing with no hard spending limits by default
  • Static maps, dynamic maps, and geocoding all count separately
  • Costs can spiral out of control with bot traffic or unexpected usage spikes
  • A simple pet project can suddenly generate significant monthly bills

This isn’t sustainable for hobby projects, personal websites, or even small business applications where mapping is a minor feature rather than the core product.

The Solution: A Multi-Provider Approach

The CakePHP Geo plugin has evolved to support a comprehensive set of alternatives. Instead of being locked into a single provider, you can now:

  • Use completely free open-source tile providers
  • Choose from multiple geocoding services with generous free tiers
  • Mix and match providers for different use cases
  • Set up fallback chains for reliability

Interactive Maps with Leaflet

The new LeafletHelper brings Leaflet.js to CakePHP – a lightweight, open-source JavaScript library that powers maps on thousands of websites.

Basic Usage

// In your controller or view
$this->loadHelper('Geo.Leaflet', ['autoScript' => true]);
// Create a map
$map = $this->Leaflet->map([
'zoom' => 13,
'lat' => 48.2082,
'lng' => 16.3738,
]);
echo $map;
// Add markers
$this->Leaflet->addMarker([
'lat' => 48.2082,
'lng' => 16.3738,
'title' => 'Vienna',
'content' => 'Welcome to <b>Vienna</b>!',
]);
$this->Leaflet->finalize();

Tile Provider Freedom

One of the biggest advantages of Leaflet is tile provider independence. The plugin includes built-in presets for popular free providers:

// OpenStreetMap (default)
$this->Leaflet->useTilePreset(LeafletHelper::TILES_OSM);
// CartoDB Light - great for data visualization
$this->Leaflet->useTilePreset(LeafletHelper::TILES_CARTO_LIGHT);
// CartoDB Dark - perfect for dark mode UIs
$this->Leaflet->useTilePreset(LeafletHelper::TILES_CARTO_DARK);

Or use any custom tile provider:

echo $this->Leaflet->map([
'zoom' => 10,
'lat' => 48.2082,
'lng' => 16.3738,
'tileLayer' => [
'url' => 'https://{s}.tile.opentopomap.org/{z}/{x}/{y}.png',
'options' => [
'attribution' => '© OpenStreetMap, © OpenTopoMap',
'maxZoom' => 17,
],
],
]);
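
For readers new to tile URLs: {z} is the zoom level, {x} and {y} are the tile column and row, and {s} is a subdomain used to spread requests. Leaflet fills these in for you in the browser. The sketch below only illustrates the standard "slippy map" tile numbering in plain PHP; the function names are hypothetical and not part of the plugin.

```php
<?php
declare(strict_types=1);

/**
 * Standard slippy-map tile numbering: coordinate -> x/y tile index at a zoom level.
 *
 * @return array{x: int, y: int}
 */
function tileIndex(float $lat, float $lng, int $zoom): array
{
    $n = 2 ** $zoom; // number of tiles per axis at this zoom level
    $latRad = deg2rad($lat);

    return [
        'x' => (int)floor(($lng + 180.0) / 360.0 * $n),
        'y' => (int)floor((1.0 - log(tan($latRad) + 1.0 / cos($latRad)) / M_PI) / 2.0 * $n),
    ];
}

function tileUrl(string $template, float $lat, float $lng, int $zoom, string $subdomain = 'a'): string
{
    $tile = tileIndex($lat, $lng, $zoom);

    // Fill the placeholders of the URL template.
    return strtr($template, [
        '{s}' => $subdomain,
        '{z}' => (string)$zoom,
        '{x}' => (string)$tile['x'],
        '{y}' => (string)$tile['y'],
    ]);
}

$url = tileUrl('https://{s}.tile.opentopomap.org/{z}/{x}/{y}.png', 48.2082, 16.3738, 10);
```

Because every provider follows the same scheme, switching providers is usually just a matter of swapping the URL template and the attribution.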

Free Tile Provider Comparison

| Provider | Free Tier | Best For |
|----------|-----------|----------|
| OpenStreetMap | Unlimited* | General purpose |
| CartoDB | 75k/month | Clean design, data viz |
| Stadia/Stamen | 200k/month | Artistic styles |
| OpenTopoMap | Unlimited* | Outdoor/hiking |
| Esri | Unlimited* | Professional maps |
| CyclOSM | Unlimited* | Cycling routes |

*Fair use policy applies

Advanced Features

The LeafletHelper supports everything you’d expect from a modern mapping library:

// Auto-centering on markers
$map = $this->Leaflet->map(['autoCenter' => true]);
$this->Leaflet->addMarker(['lat' => 48.2, 'lng' => 16.3]);
$this->Leaflet->addMarker(['lat' => 47.0, 'lng' => 15.4]);
$this->Leaflet->finalize();
// Drawing shapes
$this->Leaflet->addPolyline(
['lat' => 48.2082, 'lng' => 16.3738],
['lat' => 47.0707, 'lng' => 15.4395],
['color' => '#ff0000', 'weight' => 5]
);
// Circles with radius
$this->Leaflet->addCircle([
'lat' => 48.2082,
'lng' => 16.3738,
'radius' => 5000, // meters
'fillOpacity' => 0.2,
]);
// GeoJSON support
$this->Leaflet->addGeoJson($geoJsonData);
// Marker clustering for large datasets
$this->Leaflet->enableClustering();

Static Maps Without Google

The new StaticMapHelper generates static map images from multiple providers – perfect for emails, PDFs, or pages where you don’t need interactivity.

Supported Providers

| Provider | Free Tier | Sign-up |
|----------|-----------|---------|
| Geoapify | 3,000/day | geoapify.com |
| Mapbox | 50k/month | mapbox.com |
| Stadia | 200k/month | stadiamaps.com |
| Google | Pay-as-you-go | cloud.google.com |

You can also create custom providers by extending the base provider classes if you need to integrate with other services.

Usage

$this->loadHelper('Geo.StaticMap');
// Basic static map
echo $this->StaticMap->image([
'lat' => 48.2082,
'lng' => 16.3738,
'zoom' => 12,
]);
// Switch providers easily
echo $this->StaticMap->image([
'provider' => StaticMapHelper::PROVIDER_GEOAPIFY,
'lat' => 48.2082,
'lng' => 16.3738,
'zoom' => 12,
'style' => 'osm-bright',
]);
// Add markers
echo $this->StaticMap->image([
'provider' => StaticMapHelper::PROVIDER_MAPBOX,
'zoom' => 12,
'markers' => [
['lat' => 48.2082, 'lng' => 16.3738, 'color' => 'red', 'label' => 'A'],
['lat' => 48.1951, 'lng' => 16.3715, 'color' => 'blue', 'label' => 'B'],
],
]);

Configuration

Set up your preferred provider globally:

// config/app_local.php
'StaticMap' => [
'provider' => 'geoapify',
'size' => '400x300',
'geoapify' => [
'apiKey' => env('GEOAPIFY_API_KEY'),
'style' => 'osm-bright',
],
],

Geocoding Alternatives

The Geocoder now supports multiple providers with automatic fallback:

// config/app_local.php
'Geocoder' => [
// Use Nominatim (free, OpenStreetMap-based) as default
'provider' => Geocoder::PROVIDER_NOMINATIM,
'nominatim' => [
'userAgent' => 'MyApp/1.0', // Required by OSM policy
],
],

Provider Fallback Chain

Set up automatic failover between providers:

'Geocoder' => [
'providers' => [
Geocoder::PROVIDER_NOMINATIM, // Try free option first
Geocoder::PROVIDER_GEOAPIFY, // Fall back to Geoapify
Geocoder::PROVIDER_GOOGLE, // Google as last resort
],
'nominatim' => [
'userAgent' => 'MyApp/1.0',
],
'geoapify' => [
'apiKey' => env('GEOAPIFY_API_KEY'),
],
'google' => [
'apiKey' => env('GOOGLE_MAPS_API_KEY'),
],
],

The chain automatically handles rate limiting and server errors, trying the next provider when one fails.
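
The idea behind such a chain fits in a few lines of plain PHP. This is a sketch of the general technique only, not the plugin's implementation: the providers are hypothetical closures, and any RuntimeException is treated as "try the next one".

```php
<?php
declare(strict_types=1);

/**
 * Tries each provider in order and returns the first successful result.
 *
 * @param array<string, callable(string): array{lat: float, lng: float}> $providers
 * @return array{provider: string, lat: float, lng: float}
 */
function geocodeWithFallback(array $providers, string $address): array
{
    $errors = [];
    foreach ($providers as $name => $provider) {
        try {
            return ['provider' => $name] + $provider($address);
        } catch (\RuntimeException $e) {
            // Rate limit, server error, ...: remember why and move on.
            $errors[] = $name . ': ' . $e->getMessage();
        }
    }

    throw new \RuntimeException('All providers failed (' . implode('; ', $errors) . ')');
}

// Hypothetical stand-ins for real provider clients.
$providers = [
    'nominatim' => static function (string $address): array {
        throw new \RuntimeException('429 Too Many Requests');
    },
    'geoapify' => static fn (string $address): array => ['lat' => 48.2082, 'lng' => 16.3738],
];

$result = geocodeWithFallback($providers, 'Vienna, Austria');
```

Keeping the collected error messages makes it much easier to see later why the expensive last-resort provider was hit.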

Geocoding Provider Comparison

| Provider | Free Tier | API Key | Notes |
|----------|-----------|---------|-------|
| Nominatim | 1 req/sec | No | OSM-based, requires user-agent |
| Geoapify | 3,000/day | Yes (free) | Good accuracy |
| Google | $200 credit/month | Yes | Best accuracy, expensive beyond credit |

Testing with NullProvider

For unit tests, use the NullProvider to avoid external API calls:

'Geocoder' => [
'provider' => Geocoder::PROVIDER_NULL,
],

This returns predictable mock data, making your tests fast and reliable.

Migration Guide

Moving from Google-only to multi-provider is straightforward:

1. Interactive Maps: GoogleMapHelper to LeafletHelper

Before:

$this->loadHelper('Geo.GoogleMap');
echo $this->GoogleMap->map();
$this->GoogleMap->addMarker(['lat' => 48.2, 'lng' => 16.3]);
$this->GoogleMap->finalize();

After:

$this->loadHelper('Geo.Leaflet', ['autoScript' => true]);
echo $this->Leaflet->map();
$this->Leaflet->addMarker(['lat' => 48.2, 'lng' => 16.3]);
$this->Leaflet->finalize();

2. Static Maps

Before:

echo $this->GoogleMap->staticMap(['center' => '48.2,16.3', 'zoom' => 12]);

After:

$this->loadHelper('Geo.StaticMap');
echo $this->StaticMap->image([
'lat' => 48.2,
'lng' => 16.3,
]);

3. Geocoding

Before:

'Geocoder' => [
'apiKey' => env('GOOGLE_MAPS_API_KEY'),
],

After:

'Geocoder' => [
'provider' => Geocoder::PROVIDER_GEOAPIFY,
'geoapify' => [
'apiKey' => env('GEOAPIFY_API_KEY'),
],
],

Other New Features

Marker Clustering

When displaying many markers on a Leaflet map, clustering prevents visual clutter and improves performance:

$this->Leaflet->enableClustering();
// Add hundreds of markers
foreach ($locations as $location) {
$this->Leaflet->addMarker([
'lat' => $location->lat,
'lng' => $location->lng,
]);
}

Nearby markers are automatically grouped into clusters that expand when clicked or zoomed.

Spatial Queries with Index Support

For applications with larger datasets, the plugin now supports spatial queries using native database spatial functions. Instead of calculating distances purely in PHP or with basic SQL, you can leverage spatial indexes for significant performance improvements.

$query = $this->Addresses->find('spatial', [
'lat' => 48.2082,
'lng' => 16.3738,
'distance' => 100, // km
]);

The spatial finder uses a two-stage approach:

  1. A bounding box pre-filter with ST_Within() leverages spatial indexes to quickly eliminate distant records
  2. ST_Distance_Sphere() then calculates precise distances on the filtered result set

This works with MySQL 5.7+, MariaDB 10.4+, and PostGIS databases. For smaller datasets, the standard distance finder remains a simpler option.
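
To see why the two stages help, here is the same idea in plain PHP over an in-memory list. It is an illustration under simplifying assumptions (spherical earth with a 6371 km radius, no handling of the poles or the antimeridian) and not the SQL the plugin generates; all function names are hypothetical.

```php
<?php
declare(strict_types=1);

const EARTH_RADIUS_KM = 6371.0;

/** Great-circle distance in km (haversine formula). */
function haversineKm(float $lat1, float $lng1, float $lat2, float $lng2): float
{
    $dLat = deg2rad($lat2 - $lat1);
    $dLng = deg2rad($lng2 - $lng1);
    $a = sin($dLat / 2) ** 2 + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dLng / 2) ** 2;

    return 2 * EARTH_RADIUS_KM * asin(sqrt($a));
}

/**
 * @param list<array{name: string, lat: float, lng: float}> $points
 * @return array<string, float> name => distance in km, nearest first
 */
function findWithin(array $points, float $lat, float $lng, float $radiusKm): array
{
    // Stage 1: cheap bounding box (the part a spatial index can answer).
    $latDelta = rad2deg($radiusKm / EARTH_RADIUS_KM);
    $lngDelta = rad2deg($radiusKm / (EARTH_RADIUS_KM * cos(deg2rad($lat))));

    $result = [];
    foreach ($points as $p) {
        if (abs($p['lat'] - $lat) > $latDelta || abs($p['lng'] - $lng) > $lngDelta) {
            continue;
        }
        // Stage 2: exact distance, only for the survivors.
        $distance = haversineKm($lat, $lng, $p['lat'], $p['lng']);
        if ($distance <= $radiusKm) {
            $result[$p['name']] = $distance;
        }
    }
    asort($result);

    return $result;
}

$points = [
    ['name' => 'Graz', 'lat' => 47.0707, 'lng' => 15.4395],
    ['name' => 'Bratislava', 'lat' => 48.1486, 'lng' => 17.1077],
    ['name' => 'Lisbon', 'lat' => 38.7223, 'lng' => -9.1393],
];
$nearVienna = findWithin($points, 48.2082, 16.3738, 100.0);
```

In the database, stage 1 is what the spatial index accelerates; the trigonometry of stage 2 then only runs for the handful of rows inside the box.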

Migration Setup

To use spatial queries, you need a POINT column with a spatial index. Here’s an example migration:

public function up(): void {
    // Add coordinates column as nullable first
    $this->table('addresses')
        ->addColumn('coordinates', 'point', ['null' => true])
        ->update();

    // Populate from existing lat/lng data
    $this->execute("
        UPDATE addresses
        SET coordinates = ST_GeomFromText(CONCAT('POINT(', lng, ' ', lat, ')'))
    ");

    // Make NOT NULL with SRID 0 (required for spatial index)
    $this->execute("
        ALTER TABLE addresses
        MODIFY COLUMN coordinates POINT NOT NULL SRID 0
    ");

    // Add spatial index
    $this->execute('ALTER TABLE addresses ADD SPATIAL INDEX coordinates (coordinates)');
}

Note: SRID 0 (Cartesian coordinate system) is required for the spatial index to work properly with ST_Within().

Keeping Coordinates in Sync

When lat/lng values change, the coordinates POINT column must be updated. You can handle this in beforeSave():

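// In the table class; imports needed at the top of the file:
// use Cake\Database\Driver\Mysql;
// use Cake\Datasource\EntityInterface;
// use Cake\Event\EventInterface;
public function beforeSave(EventInterface $event, EntityInterface $entity): void {
    // Only for MySQL/MariaDB
    if (!$this->getConnection()->getDriver() instanceof Mysql) {
        return;
    }

    // Rebuild the POINT whenever lat/lng changed or the record is new
    if ($entity->isDirty('lat') || $entity->isDirty('lng') || $entity->isNew()) {
        $entity->set(
            'coordinates',
            $this->getConnection()->newQuery()
                ->func('ST_GeomFromText', [sprintf('POINT(%s %s)', $entity->lng, $entity->lat)])
        );
    }
}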

Alternatively, you can use a database trigger to automatically sync the coordinates column whenever lat/lng are inserted or updated.

Live Demos

See all these features in action in the Sandbox.

Conclusion

You no longer need to be locked into Google’s pricing model. The CakePHP Geo plugin now provides:

  • LeafletHelper – Full-featured interactive maps with free tile providers
  • StaticMapHelper – Multi-provider static maps with unified API
  • Geocoder – Multiple providers with automatic fallback chains
  • NullProvider – Clean testing without external dependencies

All with minimal code changes from your existing Google-based implementation.

Give your projects the freedom they deserve. Check out the live demos to get started.

For details on the latest release with these new features, check the 3.7.0 release notes.


CakePHP Tips – 2026 Part 1 25 Jan 12:11 AM (3 months ago)

Compact CLI output

If you are a nerd like me, you probably appreciate my recent 5.3 addition: a more concise default output of the available commands. It already reduces scrolling a lot compared to the previous output (which is now shown with -v).

But for prod/staging, I often open up an even smaller CLI terminal, so here I still have to scroll way too much.

For this there is a neat new super-concise output available via Setup plugin v3.17.0+.

Either add it based on debug mode (only for prod), or just globally:

'Setup' => [
'compactHelp' => true,
],
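
For the debug-based variant, one option is to set the flag at runtime instead of hard-coding it. A minimal sketch, assuming it is placed in config/bootstrap.php after the application config has been loaded and that the plugin reads the key through Configure. Only the Setup.compactHelp key comes from the snippet above; the placement is an assumption:

```php
// config/bootstrap.php (after the Configure::load() calls)
use Cake\Core\Configure; // already imported in the default app skeleton

// Compact command list only when debug is off, i.e. on prod/staging.
Configure::write('Setup.compactHelp', !Configure::read('debug'));
```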

Example output:

bin/cake
No command provided. Choose one of the available commands.
Available Commands:
- asset_compress [build|clear]
- audit_stash [cleanup]
- cache [clear|clear_all|clear_group|list]
- cli_test
- completion
- counter_cache
- current_config [configure|display|phpinfo|validate]
- database_logs [cleanup|export|monitor|reset|show]
- db [init|reset|wipe]
- db_backup [create|restore]
- db_data [dates|enums|orphans]
- db_integrity [bools|constraints|ints|keys|nulls]
- healthcheck
- help
- i18n [dump_from_db|extract|extract_to_db|init|validate]
- inflect
- issues
- mail_check
- mailer
- main
- maintenance_mode [activate|deactivate|status|whitelist]
- migrations [dump|mark_migrated|migrate|rollback|status]
- page_cache [clear|status]
- plugin [list|load|loaded|unload]
- plugin assets [copy|remove|symlink]
- queue [add|info|job|run|worker]
- real_notification
- reset
- routes [check|generate]
- scheduler [run]
- schema_cache [build|clear]
- seeds [reset|run|status]
- server
- tiny_auth [add|sync]
- user [create|update]
- user_notification
- version

Djot Templating

Use Djot templating with the Markup plugin to generate HTML from a readable syntax that is free of any HTML by default.

It is powerful and versatile, and it means that technical writers, for example, never have to write HTML directly. Everything is translated into HTML on rendering, and customizations can be added on an opt-in basis.

  • Whole templates using DjotView
  • Partial templates
  • Code snippets or small elements (ideal for e.g. flash messages or alike)

and more.

Why Djot instead of Markdown?

First of all, it is more secure, which makes it a good fit when not all users are trusted admins. It is also twice as fast.

Read the pros and syntax improvements in the linked repo above for details.

Whole Templates using DjotView

Render entire templates written in Djot syntax with .djot extension.

Controller:

// src/Controller/PagesController.php
public function documentation(): void
{
$this->viewBuilder()->setClassName('Markup.Djot');
$this->set('username', $this->Authentication->getIdentity()->username);
}

Template (templates/Pages/documentation.djot):

# Welcome, {{username}}!
This page was rendered from a `.djot` template file.
You can use all djot features:
- *Bold* and _italic_ text
- [Links](https://example.com)
- `Code blocks`
## Features
| Feature | Status |
|---------|--------|
| Tables | Yes |
| Lists | Yes |
| Code | Yes |

Partial Templates

Use the Djot helper to render partial content within regular PHP templates.

AppView setup:

// src/View/AppView.php
public function initialize(): void
{
$this->addHelper('Markup.Djot');
}

Template usage:

// templates/Articles/view.php
<div class="article-content">
<?= $this->Djot->convert($article->body) ?>
</div>
<aside class="sidebar">
<?= $this->Djot->convert($article->summary) ?>
</aside>

Code Snippets and Small Elements

Ideal for flash messages, notifications, or any small dynamic content.

Flash messages:

// src/Controller/UsersController.php
public function register(): void
{
if ($this->request->is('post')) {
// ... save logic
$this->Flash->success('Account created! Check your _email_ for the *activation link*.');
}
}

Custom flash element (templates/element/flash/default.php):

<?php
/** @var string $message */
?>
<div class="flash-message <?= h($class ?? 'info') ?>">
<?= $this->Djot->convert($message) ?>
</div>

Helper methods for inline content:

// In any template
$notice = '_Note:_ This action *cannot* be undone.';
echo $this->Djot->convert($notice);
// Tooltips or help text
$helpText = 'Use `Ctrl+S` to save or `Ctrl+Z` to undo.';
echo $this->Djot->convert($helpText);


Djot PHP: A Modern Markup Parser 8 Dec 2025 8:59 PM (5 months ago)

If you’ve ever wished Markdown was a bit more consistent and feature-rich, you’ll want to hear about Djot – and now there’s a complete PHP implementation available.

What is Djot?

Djot is a lightweight markup language by the author of CommonMark (Markdown) and Pandoc. It takes the best ideas from Markdown while addressing many of its ambiguities and limitations. The syntax is familiar yet more predictable, making it an excellent choice for content-heavy applications. You could call it a possible successor.

The php-collective/djot composer package brings full Djot support to PHP 8.2+, with 100% compatibility with the official djot test suite.

Use Cases

Let’s talk about common cases where such a markup language would be beneficial:

  • Blog engines and CMS platforms
  • Documentation systems
  • Technical writing applications
  • User-generated content (comments, forums) with Profile-based restrictions
  • Any project requiring lightweight markup with advanced formatting
  • Customizable to specific (business relevant) markup/constructs
  • Secure by design

Let’s see if Djot fits these needs.

Feature Highlights

Rich Text Formatting

Djot supports the familiar emphasis and strong formatting, plus several extras:

| Syntax | Result | Description |
|--------|--------|-------------|
| *Strong* | Strong | Bold text |
| _Emphasized_ | Emphasized | Italic text |
| {=Highlighted=} | Highlighted | Highlighted text |
| {+Inserted+} | Inserted | Inserted text |
| {-Deleted-} | Deleted | Deleted text |
| `code` | code | Inline code |
| E=mc^2^ | E=mc² | Superscript |
| H~2~O | H₂O | Subscript |

Smart Typography

Smart quotes, em-dashes, en-dashes, and ellipsis are handled automatically:

  • "Hello" becomes “Hello” with curved quotes
  • --- becomes an em-dash (—)
  • -- becomes an en-dash (–)
  • ... becomes an ellipsis (…)
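
The dash and ellipsis rules are simple enough to sketch in plain PHP. This is a deliberately reduced illustration and not how the library does it: curly quotes need context (opening vs. closing) and are left out, and the function name is hypothetical.

```php
<?php
declare(strict_types=1);

// Simplified sketch: dashes and ellipsis only.
function smartPunctuation(string $text): string
{
    // strtr() tries the longest matching key first, so "---" wins over "--".
    return strtr($text, [
        '---' => "\u{2014}", // em-dash
        '--' => "\u{2013}",  // en-dash
        '...' => "\u{2026}", // ellipsis
    ]);
}

$out = smartPunctuation('Wait... pages 10--20 --- roughly.');
```

A real parser also has to skip code spans and escaped characters, which is why this belongs in the parser rather than in a post-processing step.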

Tables with Alignment

Full table support with column alignment:

| Feature | Status | Notes |
|:------------|:------:|--------:|
| Left-align | Center | Right |

Task Lists

Native checkbox support for task lists:

- [x] Create parser
- [x] Create renderer
- [ ] World domination

Since this post is written in Djot, here’s the actual rendered output:

  ☑ Create parser
  ☑ Create renderer
  ☐ World domination

Divs with Classes

Create styled containers with the triple-colon syntax:

::: warning
This is a warning message.
:::

Renders as:

<div class="warning">
<p>This is a warning message.</p>
</div>

Live demo:

Note: This is a note block. Use it for tips, hints, or additional information that complements the main content.

Warning: This is a warning block. Use it to highlight important cautions or potential issues that readers should be aware of.

Spans with Attributes

Add classes, IDs, or custom attributes to inline content:

This is [important]{.highlight #key-point}

Code Blocks

Fenced code blocks with syntax highlighting hints:

```php
$converter = new DjotConverter();
echo $converter->convert($text);
```

Captions (Images, Blockquotes & Tables)

The ^ prefix adds a caption to the block immediately above it:

| Block Type | HTML Output |
|------------|-------------|
| Image | <figure> + <figcaption> |
| Table | <caption> inside <table> |
| Blockquote | <figure> + <figcaption> |

> To be or not to be,
> that is the question.
^ William Shakespeare

Renders as:

To be or not to be, that is the question.

William Shakespeare

The Markdown Elephant in the Room

Let’s be honest: Markdown has quirks. Ever spent 20 minutes debugging why your nested list won’t render correctly? Or wondered why _this_works_ but _this_doesn't_ in some parsers?

Djot was designed by someone who knows these pain points intimately – John MacFarlane literally wrote the CommonMark spec. With Djot, he started fresh with lessons learned from years of Markdown edge cases.

The result? A syntax that feels familiar but actually behaves predictably. Your users write content, not workarounds.

Why Djot Over Markdown?

  • More consistent syntax – Fewer edge cases and ambiguities
  • Better nesting – Clear rules for nested emphasis and containers
  • Built-in features – Highlights, insertions, deletions, and spans without extensions
  • Smart typography – Automatic without additional plugins
  • Cleaner specification – Easier to implement correctly
  • Easier to extend – AST makes adding new features straightforward
  • Secure by design – Random unfenced HTML like <b>...</b> shouldn’t be treated as such blindly

Djot vs Markdown: Quick Comparison

| Feature | Markdown | Djot |
|---------|----------|------|
| Strong | **text** or __text__ | *text* |
| Emphasis | *text* or _text_ | _text_ |
| Highlight | ❌ (needs extension) | {=text=} |
| Insert/Delete | ❌ (needs extension) | {+text+} / {-text-} |
| Attributes | ❌ (non-standard) | [text]{.class #id} |
| Divs | ❌ | ::: classname |
| Smart quotes | Depends on parser | Always on |
| Nested emphasis | Inconsistent | Predictable |
| Hard line breaks | Two trailing spaces | Visible \ (backslash) |

Trailing spaces are problematic since most IDEs and editors auto-trim whitespace. Using a visible \ character is much cleaner.

Auto-HTML is also problematic for user-generated content. Djot treats everything as text by default – you must explicitly enable raw HTML (see below).

Basic Usage

Converting Djot to HTML is straightforward:

use Djot\DjotConverter;
$converter = new DjotConverter();
$html = $converter->convert($djotText);

Need XHTML output? Just pass a flag:

$converter = new DjotConverter(xhtml: true);

Advanced Usage

For more control, you can work with the AST directly:

$converter = new DjotConverter();
// Parse to AST
$document = $converter->parse($djotText);
// Manipulate the AST if needed...
// Render to HTML
$html = $converter->render($document);

Markdown compatibility modes

Note: This is specific to this library and not yet officially part of the spec. Using it in your apps means your users get the best of both concepts, but it also means you need to document this behavior yourself and cannot “just” link to the Djot spec.

Soft break mode

Configure soft breaks as per context and user needs:

| Mode | HTML Output | Browser Display |
|------|-------------|-----------------|
| Newline | \n | No visible break (whitespace collapsed) |
| Space | (single space) | No visible break (whitespace collapsed) |
| Break | <br> | Visible line break |

$renderer = $converter->getRenderer(); // HtmlRenderer
// Default - newline in source, invisible in browser
$renderer->setSoftBreakMode(SoftBreakMode::Newline);
// Space - same visual result, slightly smaller HTML
$renderer->setSoftBreakMode(SoftBreakMode::Space);
// Break - every source line break becomes visible <br>
$renderer->setSoftBreakMode(SoftBreakMode::Break);

This offers a degree of compatibility for users who are used to Markdown-style line breaks within normal text, which makes it useful for chats or simple text inputs.

As this only affects the rendering, but not the parsing, this is still fully spec-compliant in that way.
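
The following plain-PHP sketch shows why this is a pure rendering concern: the source is split into lines the same way in all three modes, and only the string used to join them differs. The enum and function here are hypothetical stand-ins, not the library's SoftBreakMode or HtmlRenderer.

```php
<?php
declare(strict_types=1);

// Hypothetical enum for illustration; the library ships its own SoftBreakMode.
enum SoftBreakSketch
{
    case Newline;
    case Space;
    case LineBreak;
}

function renderParagraph(string $source, SoftBreakSketch $mode): string
{
    // The "parsing" step is identical for every mode.
    $lines = array_map('htmlspecialchars', explode("\n", $source));

    // Only the rendering of the soft break differs.
    $softBreak = match ($mode) {
        SoftBreakSketch::Newline => "\n",
        SoftBreakSketch::Space => ' ',
        SoftBreakSketch::LineBreak => "<br>\n",
    };

    return '<p>' . implode($softBreak, $lines) . '</p>';
}

$source = "first line\nsecond line";
$asSpace = renderParagraph($source, SoftBreakSketch::Space);
$asBreak = renderParagraph($source, SoftBreakSketch::LineBreak);
```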

Significant Newlines Mode (Markdown-Like)

This mode is for users accustomed to Markdown’s “human” behavior where newlines intuitively interrupt blocks.

The Djot specification states: “Paragraphs can never be interrupted by other block-level elements.”

In standard Djot, this means lists and other elements require blank lines before them – more “spaced” than what Markdown users expect.

There’s an easy solution to get the best of both worlds:

$converter = new DjotConverter(significantNewlines: true);
$result = $converter->convert("Here's a list:
- Item one
- Item two");
// Output: <p>Here's a list:</p>\n<ul><li>Item one</li><li>Item two</li></ul>

If you need a marker character (-, *, +, >) at the start of a line without triggering a block, use escaping:

// Without escaping - creates a list
$result = $converter->convert("Price:
- 10 dollars");
// Output: <p>Price:</p><ul><li>10 dollars</li></ul>
// With escaping - literal text
$result = $converter->convert("Price:
\\- 10 dollars");
// Output: <p>Price:<br>- 10 dollars</p>

This returns you to standard Djot behavior for that line.

This mode is useful when migrating existing systems where users expect Markdown-like behavior: most content works without changes, and the rare edge cases can be escaped. For offline docs and anything that needs to stay portable across Djot implementations, stick to the default, spec-compliant behavior.

Customization

Extension System

The library includes a clean, modern extension system, making common features trivial to add:

use Djot\DjotConverter;
use Djot\Extension\ExternalLinksExtension;
use Djot\Extension\TableOfContentsExtension;
use Djot\Extension\DefaultAttributesExtension;
$converter = new DjotConverter();
$converter
->addExtension(new ExternalLinksExtension())
->addExtension(new TableOfContentsExtension(position: 'top'))
->addExtension(new DefaultAttributesExtension([
'image' => ['loading' => 'lazy'],
'table' => ['class' => 'table table-striped'],
]));

Built-in Extensions

| Extension | Description |
|-----------|-------------|
| AutolinkExtension | Auto-links bare URLs and email addresses |
| DefaultAttributesExtension | Adds default attributes by element type (lazy loading, CSS classes) |
| ExternalLinksExtension | Adds target="_blank" and rel="noopener noreferrer" to external links |
| HeadingPermalinksExtension | Adds clickable anchor links to headings |
| MentionsExtension | Converts @username patterns to profile links |
| TableOfContentsExtension | Generates TOC from headings with optional auto-insertion |

The DefaultAttributesExtension is particularly useful:

$converter->addExtension(new DefaultAttributesExtension([
'image' => ['loading' => 'lazy', 'decoding' => 'async'],
'table' => ['class' => 'table table-bordered'],
'block_quote' => ['class' => 'blockquote'],
]));

Extensions can also be combined. For example, AutolinkExtension should be registered before ExternalLinksExtension so auto-linked URLs also get the external link attributes.

Custom Rendering with Events

For more control, use the event system directly:

use Djot\Renderer\Event\RenderEvent;

$renderer = $converter->getRenderer();

// Convert :emoji: symbols to actual emoji
$renderer->addEventListener('render.symbol', function (RenderEvent $event) {
    $node = $event->getNode();
    $emoji = match ($node->getName()) {
        'smile' => '😊',
        'heart' => '❤',
        'rocket' => '🚀',
        default => ':' . $node->getName() . ':',
    };
    $event->setHtml($emoji);
});

// Add target="_blank" to external links
$renderer->addEventListener('render.link', function (RenderEvent $event) {
    $link = $event->getNode();
    $url = $link->getDestination();
    if (str_starts_with($url, 'http')) {
        $link->setAttribute('target', '_blank');
        $link->setAttribute('rel', 'noopener noreferrer');
    }
});

Custom Inline Patterns

Need #hashtags or wiki-style links? The parser supports custom inline patterns:

use Djot\Node\Inline\Link;
use Djot\Node\Inline\Text;
$parser = $converter->getParser()->getInlineParser();
// #hashtags → tag pages
$parser->addInlinePattern('/#([a-zA-Z][a-zA-Z0-9_]*)/', function ($match, $groups) {
$link = new Link('/tags/' . strtolower($groups[1]));
$link->appendChild(new Text('#' . $groups[1]));
return $link;
});
echo $converter->convert('Check out #PHP and #Djot!');
// <p>Check out <a href="/tags/php">#PHP</a> and <a href="/tags/djot">#Djot</a>!</p>
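
To get a feel for what the regular expression captures, the same pattern can be tried on a plain string. This sketch is string-based and therefore not equivalent to the AST-based hook above (it would, for instance, also touch text inside code spans); the function name is hypothetical.

```php
<?php
declare(strict_types=1);

// Plain-PHP illustration of the same pattern; works on a string, not on the Djot AST.
function linkHashtags(string $text): string
{
    // Escape first, so that user input can never inject markup.
    $escaped = htmlspecialchars($text);

    $linked = preg_replace_callback(
        '/#([a-zA-Z][a-zA-Z0-9_]*)/',
        static function (array $groups): string {
            // $groups[1] is the tag name without the leading "#".
            $href = '/tags/' . strtolower($groups[1]);

            return '<a href="' . $href . '">#' . $groups[1] . '</a>';
        },
        $escaped,
    );

    // preg_replace_callback() returns null on a regex error; fall back to the escaped text.
    return $linked ?? $escaped;
}

$html = linkHashtags('Check out #PHP and #Djot!');
```

Because the pattern requires a letter right after the #, numeric character references such as &#039; produced by the escaping step are left alone.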

Custom block patterns are also supported for admonitions, tab containers, and more. See the Cookbook for recipes including wiki links, math rendering, and image processing.

Feature Restriction: Profiles

SafeMode prevents XSS attacks, but what about controlling which markup features users can access? A comment section probably shouldn’t allow headings, tables, or raw HTML – not because they’re dangerous, but because they’re inappropriate for that context.

That’s where Profiles come in. They complement SafeMode by restricting available features based on context:

use Djot\Profile;
// Comment sections: basic formatting only
$converter = new DjotConverter(profile: Profile::comment());
// Blog posts: rich formatting, but no raw HTML
$converter = new DjotConverter(profile: Profile::article());
// Chat messages: text, bold, italic - that's it
$converter = new DjotConverter(profile: Profile::minimal());

SafeMode vs Profile

| Concern | SafeMode | Profile |
|---------|----------|---------|
| Purpose | Security (XSS prevention) | Feature restriction |
| Blocks | javascript: URLs, event handlers | Headings, tables, raw HTML |
| Target | Malicious input | Inappropriate formatting |

Use both together for user-generated content:

$converter = new DjotConverter(
safeMode: true,
profile: Profile::comment()
);

Built-in Profiles

Each profile is designed for specific use cases:

  • Profile::full() – Everything enabled (admin/trusted content)
  • Profile::article() – Blog posts: no raw HTML, allows headings/tables
  • Profile::comment() – User comments: no headings/tables, adds rel="nofollow ugc" to links
  • Profile::minimal() – Chat: text, bold, italic only

Understanding Restrictions

Profiles can explain why features are restricted:

$profile = Profile::comment();
echo $profile->getReasonDisallowed('heading');
// "Headings would disrupt page hierarchy in user comments"
echo $profile->getReasonDisallowed('raw_block');
// "Raw HTML could bypass template styling and security measures"

Link Policies

Control where users can link to:

use Djot\LinkPolicy;

// Only allow links to your own domain
$profile = Profile::comment()
    ->setLinkPolicy(LinkPolicy::internalOnly());

// Or whitelist specific domains
$profile = Profile::comment()
    ->setLinkPolicy(
        LinkPolicy::allowlist(['docs.php.net', 'github.com'])
            ->withRelAttributes(['nofollow', 'ugc'])
    );

Graceful Degradation

When users try restricted features, content converts to plain text by default – nothing is lost:

$converter = new DjotConverter(profile: Profile::minimal());
$html = $converter->convert('# Heading attempt');
// Renders: <p>Heading attempt</p> (text preserved, heading stripped)

For stricter handling, you can strip content entirely or throw exceptions:

$profile = Profile::minimal()->setDefaultAction(Profile::ACTION_STRIP);
// Or for APIs:
$profile = Profile::minimal()->setDefaultAction(Profile::ACTION_ERROR);

Architecture

The package uses a clean separation of concerns:

  • BlockParser – Parses block-level elements (headings, lists, tables, code blocks, etc.)
  • InlineParser – Processes inline elements within blocks (emphasis, links, code spans)
  • HtmlRenderer – Converts the AST to HTML output

This AST-based approach makes the codebase maintainable and opens possibilities for alternative output formats.
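The two-phase idea (parse into a tree, then render the tree) can be sketched in a few lines of plain PHP. This is a deliberately tiny, hypothetical pipeline: the function names, the array-based node shape, and the supported syntax (headings, paragraphs, *strong*) are assumptions for illustration and not the actual BlockParser, InlineParser, or HtmlRenderer classes of the library.

```php
<?php
declare(strict_types=1);

// Hypothetical miniature pipeline; none of these names are djot-php's real API.
// A node is a plain array: ['type' => ..., 'children' => [...]] or ['type' => 'text', 'value' => ...].

/** Phase 1a: split the source into block nodes (headings and paragraphs only). */
function parseBlocks(string $source): array
{
    $blocks = [];
    foreach (preg_split('/\n{2,}/', trim($source)) as $chunk) {
        if (preg_match('/^(#{1,6}) +(.*)$/s', $chunk, $m)) {
            $blocks[] = ['type' => 'heading', 'level' => strlen($m[1]), 'children' => parseInlines($m[2])];
        } else {
            $blocks[] = ['type' => 'paragraph', 'children' => parseInlines($chunk)];
        }
    }

    return $blocks;
}

/** Phase 1b: turn the text of one block into inline nodes (only *strong* is recognised). */
function parseInlines(string $text): array
{
    $nodes = [];
    // With one capture group, odd indexes hold the captured strong content.
    $parts = preg_split('/\*([^*\s][^*]*)\*/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            $nodes[] = ['type' => 'strong', 'children' => [['type' => 'text', 'value' => $part]]];
        } elseif ($part !== '') {
            $nodes[] = ['type' => 'text', 'value' => $part];
        }
    }

    return $nodes;
}

/** Phase 2: walk the tree and emit HTML. */
function renderHtml(array $nodes): string
{
    $html = '';
    foreach ($nodes as $node) {
        $html .= match ($node['type']) {
            'text' => htmlspecialchars($node['value'], ENT_QUOTES),
            'strong' => '<strong>' . renderHtml($node['children']) . '</strong>',
            'heading' => sprintf("<h%d>%s</h%d>\n", $node['level'], renderHtml($node['children']), $node['level']),
            'paragraph' => '<p>' . renderHtml($node['children']) . "</p>\n",
        };
    }

    return $html;
}

$ast = parseBlocks("# Hello\n\nThis is *strong* text.");
echo renderHtml($ast);
```

Since the renderer only ever sees the tree, a second function that walks the same $ast and emits a different format could be added without touching the parsing code, which is the point of the separation.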

Other compatibility renderers are available as well, along with converters that turn existing markup into Djot.

WordPress Plugin: Djot Markup for WP

Want to use Djot in your WordPress site? There’s now a dedicated plugin that brings full Djot support to WordPress.

Features

  • Full Content Processing – Write entire posts in Djot syntax
  • Shortcode Support – Use [djot]...[/djot] for mixed content
  • Syntax Highlighting – Built-in highlight.js with 12+ themes
  • Profiles – Limit functionality per post/page/comment type, disable raw HTML
  • Admin Settings – Easy configuration via Settings → WP Djot
  • Markdown compatibility mode and soft-break settings if coming from MD

Fun fact: I just migrated this blog from custom Markdown hacks to Djot (and wrote this post with it). For that I used the built-in migrator of that WP plugin as well as a bit of custom migration tooling.

I needed to migrate posts, articles and comments; all in all it was quite straightforward. The new interface with quick Markdown paste and other useful gimmicks actually helps to speed up technical blogging. It is both safe (comments use the right profile) and reliable.

The plugin also comes with useful semantic customization right away:

Djot Syntax | HTML Output | Rendered Output | Use Case
[CSS]{abbr="Cascading Style Sheets"} | <abbr title="...">CSS</abbr> | CSS | Abbreviations
[Ctrl+C]{kbd=""} | <kbd>Ctrl+C</kbd> | Ctrl+C | Keyboard input
[term]{dfn=""} | <dfn>term</dfn> | term | Definition term

On top of that, it ships a few goodies as extensions:

  • ![Alt text](https://www.youtube.com/watch?v=aVx-zJPEF2c){video} renders videos from all WP-supported sources right away; customize the attributes as usual: {video width=300 height=200}
  • Import from HTML or Markdown

You can also add your own customizations.

IDE Support: IntelliJ Plugin

For developers using PhpStorm, IntelliJ IDEA, or other JetBrains IDEs, there’s now an official Djot plugin available.

Features

  • Syntax Highlighting – Full TextMate grammar support for .djot files
  • Live Preview – Split-view editor with real-time rendered output
  • Theme Sync – Preview follows your IDE’s dark/light mode
  • Code Block Highlighting – Syntax highlighting within fenced code blocks
  • HTML Export – Save documents as rendered HTML files
  • Live Templates – Code snippets for common Djot patterns

The plugin requires JetBrains IDE 2024.1+ and Java 17+.

Enhancements

The library and the WP plugin already have some useful enhancements beyond the spec:

  • Full attribute support
  • Boolean Attribute Shorthand
  • Fenced Comment Blocks
  • Multiple Definition Terms and Definition Descriptions
  • Abbreviation definitions
  • Captions for Images, Tables, and Block Quotes
  • Markdown compatibility mode (Significant Newlines)

These extend beyond the current spec but are documented as such. Keep this in mind if you need cross-application compatibility.

There is also a highlight.js extension available that adds syntax highlighting for Djot content.

Performance

How fast is it? We benchmarked djot-php against Djot implementations in other languages:

Implementation | ~56 KB Doc | Throughput | vs PHP
Rust (jotdown) | ~1-2 ms | ~30+ MB/s | ~10x faster
Go (godjot) | ~2-4 ms | ~15+ MB/s | ~5x faster
JS (@djot/djot) | ~8 ms | ~7 MB/s | ~2x faster
PHP (djot-php) | ~18 ms | ~3 MB/s | baseline
Python (markdown-it) | ~37 ms | ~1.5 MB/s | ~2x slower*

*Python comparison uses Markdown parsers since no Djot implementation exists for Python.

Key observations:

  • PHP processes ~2-3 MB/s of Djot content consistently
  • Performance scales linearly O(n) with document size
  • Safe mode and Profiles have negligible performance impact
  • Roughly 2x faster than the Python Markdown parser, ~2x slower than the JavaScript reference implementation

For typical blog posts and comments (1-10 KB), parsing takes under 5 ms. A 1 MB document converts in ~530 ms using ~44 MB RAM.

The performance documentation includes detailed benchmarks, memory profiling, and stress test results.

Comparison with PHP Markup Libraries

It is also interesting to compare it with other PHP parsers, most of which are Markdown parsers:

Library | 27 KB Doc | Throughput | vs djot-php
erusev/parsedown | 1.73 ms | 15.6 MB/s | 5.9x faster
michelf/php-markdown | 5.26 ms | 5.1 MB/s | 1.9x faster
michelf/php-markdown (Extra) | 6.12 ms | 4.4 MB/s | 1.7x faster
djot-php | 10.22 ms | 2.6 MB/s | baseline
league/commonmark | 16.17 ms | 1.7 MB/s | 1.6x slower
league/commonmark (GFM) | 16.86 ms | 1.6 MB/s | 1.7x slower

That Parsedown comes out fastest is no surprise:

  1. Simpler architecture – Parsedown uses a single-pass regex-based approach without building a full AST (Abstract Syntax Tree). It directly outputs HTML while parsing.
  2. No AST overhead – djot-php and CommonMark build a complete node tree first, then traverse it to render. This two-phase approach enables features (events, transforms, multiple output formats) but costs time.
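The throughput and "vs djot-php" columns of the 27 KB comparison follow directly from the measured times. A small sketch of that arithmetic, assuming decimal units (so KB per millisecond equals MB per second); the helper names are hypothetical and the timings are copied from the table:

```php
<?php
declare(strict_types=1);

// Hypothetical helpers; the input figures are copied from the 27 KB table.
function throughputMbPerSec(float $sizeKb, float $timeMs): float
{
    // KB per millisecond is numerically the same as MB per second (decimal units).
    return $sizeKb / $timeMs;
}

function relativeTo(float $timeMs, float $baselineMs): string
{
    if ($timeMs === $baselineMs) {
        return 'baseline';
    }

    return $timeMs < $baselineMs
        ? sprintf('%.1fx faster', $baselineMs / $timeMs)
        : sprintf('%.1fx slower', $timeMs / $baselineMs);
}

$baselineMs = 10.22; // djot-php
$timings = [
    'erusev/parsedown' => 1.73,
    'djot-php' => 10.22,
    'league/commonmark' => 16.17,
];

foreach ($timings as $library => $ms) {
    printf(
        "%-18s %5.1f MB/s  %s\n",
        $library,
        throughputMbPerSec(27.0, $ms),
        relativeTo($ms, $baselineMs)
    );
}
```

For these three rows the script reproduces the table: 15.6 MB/s and 5.9x faster for Parsedown, 2.6 MB/s for the djot-php baseline, and 1.7 MB/s and 1.6x slower for league/commonmark.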

Key finding with equivalent features enabled:

Library | Time | vs djot-php
djot-php | 11.36 ms | baseline
CommonMark (GFM) | 15.00 ms | 1.3x slower
CommonMark (Full) | 23.54 ms | 2.1x slower

Djot syntax was designed for efficient parsing.

Feature Comparison

Feature | djot-php | CommonMark | Parsedown | Michelf
Basic formatting | Yes | Yes | Yes | Yes
Tables | Yes | GFM only | Yes | Extra
Footnotes | Yes | No | No | Extra
Definition lists | Yes | No | No | Extra
Task lists | Yes | GFM only | No | No
Smart typography | Yes | No | No | No
Math expressions | Yes | No | No | No
Attributes | Yes | No | No | Extra
Highlight/Insert/Delete | Yes | No | No | No
Super/Subscript | Yes | No | No | No
Divs/Sections | Yes | No | No | No
Event system | Yes | Yes | No | No
Safe mode | Yes | Yes | Yes | Yes
Profiles | Yes | No | No | No
Extension system | Yes | Yes | No | No

Features Unique to djot-php

  1. Smart Typography – Automatic curly quotes, em/en dashes, ellipsis
  2. Math Expressions – Inline $x^2$ and display $$ math
  3. Highlight/Insert/Delete – {=highlight=}, {+insert+}, {-delete-}
  4. Super/Subscript – H~2~O, x^2^
  5. Divs with Classes – ::: warning ... :::
  6. Profile System – Restrict features per context (full/article/comment/minimal)
  7. Abbreviations – Auto-wrap terms with <abbr> tags
  8. Captions – For images, tables, and blockquotes
  9. Converters – Import from Markdown, BBCode, HTML

Importing and Migration

Often a boolean flag is enough to keep supporting the existing markup while new content is written in Djot. For those who want to migrate, there is built-in tooling with converters:

  • HtmlToDjot
  • MarkdownToDjot
  • BbcodeToDjot

Fun fact: They also serve as a nice round-trip validation to check whether a transformation is lossless. Send a document through a converter and reverse it, and the content should still “match” without loss of supported structures.
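The round-trip idea can be sketched without the library. The two converters below are toy stand-ins written for this example (they only map Markdown **strong** and *emphasis* to Djot *strong* and _emphasis_ and back); they are not the library's MarkdownToDjot class, and roundTrips() is a hypothetical helper.

```php
<?php
declare(strict_types=1);

// Toy stand-ins, NOT the library's converters or renderers.
// They only know: Markdown **strong** is Djot *strong*,
// and Markdown *emphasis* is Djot _emphasis_.
function toyMarkdownToDjot(string $md): string
{
    // Single-star emphasis first, so the double-star strong markers stay untouched.
    $djot = preg_replace('/(?<!\*)\*(?!\*)([^*\n]+?)(?<!\*)\*(?!\*)/', '_$1_', $md);

    return preg_replace('/\*\*([^*\n]+?)\*\*/', '*$1*', $djot);
}

function toyDjotToMarkdown(string $djot): string
{
    // Strong first, then emphasis, mirroring the forward direction.
    $md = preg_replace('/\*([^*\n]+?)\*/', '**$1**', $djot);

    return preg_replace('/_([^_\n]+?)_/', '*$1*', $md);
}

/** Hypothetical helper: true when convert + reverse reproduces the input exactly. */
function roundTrips(callable $forward, callable $backward, string $document): bool
{
    return $backward($forward($document)) === $document;
}

$doc = 'A **bold** and *soft* word.';
echo toyMarkdownToDjot($doc), PHP_EOL;
var_dump(roundTrips('toyMarkdownToDjot', 'toyDjotToMarkdown', $doc));
```

This sketch compares strings byte for byte, which is stricter than the structural “match” described above; a real check would more likely compare the supported structures after normalizing whitespace and equivalent spellings.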

What’s Next?

The library is actively maintained with plans for:

  • Additional renderers (convert Djot back for interoperability)
  • More converters
  • More markup supported (not contradicting the specs)
  • Maybe some framework specific plugins or integrations

Contributions welcome!

Some personal notes

I would have liked URLs and images to have a bit more friendly syntax as well, e.g. [link: url "text"] style for links and [image: src "alt"] style for images. The ![](url) style still feels a bit too much like code syntax to me.

If I were ever to invent a new markup language, I would probably take a similar approach, but try to keep it even simpler by default. The {} braces seem a bit heavy for these common use cases, and for non-technical users.

One of the quirks I had to get used to was the automated flow (line breaks are ignored) and the need for a visible (hard) line break where one is really desired. But in the end it usually helps to keep paragraphs clear. And I added opt-in compatibility options to ease upgrading and usability.

Overall, Djot strikes a great balance between familiarity and consistency. And at least topics like URL/image syntax can easily be added as an extension if desired.

The PHP implementation in the djot-php library is the most complete implementation of the standard available. It is perfectly suited for web-based usage. Make sure to check out the live sandbox and play around with the complex examples!

Links

Give Djot PHP a try in your next project. The familiar syntax with improved consistency and a lot more out of the box might just win you over.
