Creating Sniffs for a PHPCS Standard

As a follow-up to my summary of using phpcs to lint your PHP code, this post will explain how to create, modify, and test a phpcs standard.

As a quick refresher, phpcs organizes its linting messages (warnings and errors) into “sniffs”, then into “categories”, then into a “standard”.

The official tutorial for creating a standard is here, but it doesn’t go into much detail and some parts of it are out-of-date. There’s additional useful information in the 3.0 upgrade guide but I’m going to try to do a better job here of explaining how it all works.

File Structure

When creating a sniff standard project, it should use something like the following directory structure. (Other patterns are possible, but the nested directory structure from StandardName down to the sniff file is important to get correct sniff codes.)

project-name
├── StandardName
│   ├── Sniffs
│   │   └── MyCategory
│   │       └── MyCustomSniff.php
│   └── ruleset.xml
└── composer.json

In the above example, project-name is the repository for the standard, StandardName is the name of the standard, MyCategory is the sniff category, and MyCustomSniff.php is the actual sniff file which produces errors and warnings. The sniff file must end in ...Sniff.php.

The ruleset.xml file can be simple, but must include the standard name as the name attribute in a ruleset tag. Here is an example:

<?xml version="1.0"?>
<ruleset name="StandardName">
 <description>My standard.</description>
</ruleset>

Because dealerdirect/phpcodesniffer-composer-installer (highly recommended for installing sniffs on a project) looks for other composer packages with the type phpcodesniffer-standard, when we set up our composer.json file, we should use that as the type field. Thus, the composer.json file should look something like the following.

{
    "name": "owner/phpcs-standard-name",
    "description": "A set of phpcs sniffs for modern php development.",
    "type": "phpcodesniffer-standard",
    "license": "MIT",
    "authors": [
        {
            "name": "Payton Swick",
            "email": "payton@foolord.com"
        }
    ],
    "require": {},
    "require-dev": {}
}

Inside the sniff files themselves, there must be a namespace, which is the standard name, followed by the namespace Sniffs, followed by the category name. Just like the file structure, this is required to get a correct sniff code. Here’s an example that would be in the file MyCustomSniff.php.

namespace StandardName\Sniffs\MyCategory;

The sniff itself is a class that implements the PHP_CodeSniffer\Sniffs\Sniff interface and has at least two functions, register and process.

I’ll explain how to use them below, but to see the official docs about how to use these functions, see the phpcs wiki.

Writing a Sniff Class

To understand how to write a sniff, you need a basic understanding of how PHP source code is broken down before it is analyzed.

A PHP file is split up into “tokens”, which are kind of like words or pieces of the code. Here is an example expression.

$foo = 'bar';

Here are the tokens for that expression (showing only the type and content fields).

T_VARIABLE => $foo
T_WHITESPACE => ·
T_EQUAL => =
T_WHITESPACE => ·
T_CONSTANT_ENCAPSED_STRING => 'bar'
T_SEMICOLON => ;
T_WHITESPACE => \n

Each token is an associative array of data that contains a “code” (the token constant), a “type” (the name of that constant as a string), and meta-data about that token. All tokens have certain fields like code and content, but the other fields vary depending on the token.

All the tokens from a file are put into an indexed array as they are read. This array is sometimes called the “token stack”.
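To get a feel for tokens without involving phpcs at all, here is a small sketch that uses PHP’s built-in token_get_all() on the same expression. Note that this is PHP’s own tokenizer, not the phpcs one: it returns single characters like = and ; as plain strings, whereas phpcs gives them names such as T_EQUAL and T_SEMICOLON. The $summary variable and the “(char)” label are just illustrative names.

```php
<?php
// Tokenize a small snippet with PHP's built-in tokenizer (not phpcs).
$code = "<?php \$foo = 'bar';\n";

$summary = [];
foreach (token_get_all($code) as $token) {
    if (is_array($token)) {
        // Array tokens are [id, content, line]; token_name() maps the id to its constant name.
        $summary[] = [token_name($token[0]), $token[1]];
    } else {
        // Single-character tokens such as "=" and ";" come back as plain strings.
        $summary[] = ['(char)', $token];
    }
}

// Print one "TYPE => content" line per token.
foreach ($summary as [$type, $content]) {
    echo $type . ' => ' . json_encode($content) . "\n";
}
```

The first entry will be the T_OPEN_TAG for the opening <?php, which phpcs also includes in its token stack.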

The register() function of your sniff class must return an array of token types that your sniff is interested in examining. Token types are constants built-in to PHP and are listed in the PHP docs. phpcs also adds some special tokens so it can be helpful to review the list used by its tokenizer as well.

The process() function will be called once for each token the parser finds that matches one of the types returned by register().

The process() function then checks tokens and decides if it should mark them as warnings or errors. It receives two arguments: an instance of PHP_CodeSniffer\Files\File and an integer which is the index of the token currently being processed in the token stack (sometimes this is called the “stack pointer”). Here’s an example signature of the process() function.

public function process(File $phpcsFile, $currentTokenIndex);

To inspect the current token or any other tokens in the stack, process() can call the getTokens() method on the File object. That will return an indexed array of all the tokens in the current file. Let’s assume we assign that to the variable $tokens as follows. To examine the current token, we can use $tokens[$currentTokenIndex].

$tokens = $phpcsFile->getTokens();
$tokens[$currentTokenIndex]; // the current token

As you develop a sniff, it’s often helpful to examine what data is available in each token. Here’s an example of doing that for each T_FUNCTION token in a file:

namespace StandardName\Sniffs\MyCategory;

use PHP_CodeSniffer\Sniffs\Sniff;
use PHP_CodeSniffer\Files\File;

class ExamineTokensSniff implements Sniff {
    public function register() {
        return [T_FUNCTION];
    }

    public function process(File $phpcsFile, $currentTokenIndex) {
        $tokens = $phpcsFile->getTokens();
        var_dump($tokens[$currentTokenIndex]);
    }
}

You can change the token index to examine other tokens in the stack, relative to the current one. For example, the following code would output the token immediately following the current token.

$tokens = $phpcsFile->getTokens();
var_dump($tokens[$currentTokenIndex + 1]);

You might want to find the beginning and end of a block of code, for example the ending bracket of a bracket pair, and this information is encoded in the token array. Here’s an example of what a token might look like for an open curly brace in a function definition (converted to JSON).

{
  "type": "T_OPEN_CURLY_BRACKET",
  "code": "PHPCS_T_OPEN_CURLY_BRACKET",
  "content": "{",
  "line": 3,
  "column": 34,
  "length": 1,
  "bracket_opener": 16,
  "bracket_closer": 167,
  "scope_condition": 10,
  "scope_opener": 16,
  "scope_closer": 167,
  "level": 1,
  "conditions": {
    "1": 361
  }
}

Note that the token has a property called bracket_closer which contains the stack index of the closing curly brace (in this case it’s at index 167). We can use this information to find any tokens inside the function body (any token with an index greater than 16 and less than 167).
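As a rough sketch of that idea, the following plain PHP function collects the indexes between a bracket and its closer. The function name getIndexesInsideBrackets is hypothetical, and the $tokens array here is a hand-written, heavily trimmed stand-in for what getTokens() returns (a real stack has many more fields per token).

```php
<?php
/**
 * Return the stack indexes strictly between an opening bracket token and
 * its matching closer, using the 'bracket_closer' field.
 */
function getIndexesInsideBrackets(array $tokens, int $openerPtr): array {
    if (! isset($tokens[$openerPtr]['bracket_closer'])) {
        return []; // Not a bracket token, or the bracket is unmatched.
    }
    $closerPtr = $tokens[$openerPtr]['bracket_closer'];
    if ($closerPtr - $openerPtr < 2) {
        return []; // Nothing between the brackets, e.g. "{}".
    }
    return range($openerPtr + 1, $closerPtr - 1);
}

// Hand-written stand-in for the token stack of "{ return; }".
$tokens = [
    0 => ['type' => 'T_OPEN_CURLY_BRACKET', 'content' => '{', 'bracket_opener' => 0, 'bracket_closer' => 5],
    1 => ['type' => 'T_WHITESPACE', 'content' => ' '],
    2 => ['type' => 'T_RETURN', 'content' => 'return'],
    3 => ['type' => 'T_SEMICOLON', 'content' => ';'],
    4 => ['type' => 'T_WHITESPACE', 'content' => ' '],
    5 => ['type' => 'T_CLOSE_CURLY_BRACKET', 'content' => '}', 'bracket_opener' => 0, 'bracket_closer' => 5],
];

foreach (getIndexesInsideBrackets($tokens, 0) as $ptr) {
    echo $ptr . ' ' . $tokens[$ptr]['type'] . "\n";
}
```

Inside a real process() method you would pass $phpcsFile->getTokens() and the index of the opening brace instead of the sample array.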

There’s quite a lot to be said about examining tokens, and doing so is the bulk of writing sniffs, but I’ll get back to that in a minute.

Once we have identified a token that we want to mark as an error or a warning, we can call the addError() or addWarning() methods of the phpcs File object.

To review, each error or warning in a sniff has a “sniff code” which identifies it explicitly. A full sniff code includes the standard, the category, the sniff, and the error or warning, all separated by periods.

Standard.Category.Sniff.Error

Each error or warning (collectively known as messages) must have a string to report and a code to put on the end of the full sniff code. For example, if we want to mark the current token as an error with the message “This line is boring” and the error code “Boring”, we would do the following.

$error = 'This line is boring';
$phpcsFile->addError($error, $currentTokenIndex, 'Boring');

Assuming that was in our example class ExamineTokensSniff above, we would end up with a sniff code of StandardName.MyCategory.ExamineTokens.Boring.
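To make the naming rule concrete, here is a small sketch that derives the full sniff code from a sniff’s class name and a message code. This is only an illustration of the convention, not how phpcs computes it internally; the function name buildSniffCode is hypothetical, and it assumes the simple four-part namespace layout described in this post.

```php
<?php
/**
 * Build "Standard.Category.Sniff.MessageCode" from a fully qualified
 * sniff class name of the form Standard\Sniffs\Category\NameSniff.
 */
function buildSniffCode(string $sniffClass, string $messageCode): string {
    $parts = explode('\\', ltrim($sniffClass, '\\'));
    $looksLikeSniff = count($parts) === 4
        && $parts[1] === 'Sniffs'
        && substr($parts[3], -5) === 'Sniff';
    if (! $looksLikeSniff) {
        throw new InvalidArgumentException("Unexpected sniff class name: $sniffClass");
    }
    // Drop the "Sniffs" namespace segment and the "Sniff" class suffix.
    $sniffName = substr($parts[3], 0, -5);
    return implode('.', [$parts[0], $parts[2], $sniffName, $messageCode]);
}

echo buildSniffCode('StandardName\Sniffs\MyCategory\ExamineTokensSniff', 'Boring') . "\n";
```

This also shows why the directory and namespace structure matters: if the class is not where phpcs expects it, the pieces of the code cannot be derived correctly.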

Examining Tokens

Browsing the token stack manually can be troublesome and error-prone, so the PHP_CodeSniffer\Files\File object has some helper methods you can use to examine tokens other than the current one.

There are pear docs for this class, but probably the most commonly used helpers are the following.

  • string|null getDeclarationName(int $stackPtr): Returns the declaration names for classes, interfaces, traits, and functions.
  • int|bool findNext( int|array $types, int $start, [int $end = null], [bool $exclude = false], [string $value = null], [bool $local = false]): Returns the position of the next specified token(s).
  • int|bool findPrevious( int|array $types, int $start, [int $end = null], [bool $exclude = false], [string $value = null], [bool $local = false]): Returns the position of the previous specified token(s).
  • int findFirstOnLine( int|array $types, int $start, [bool $exclude = false], [string $value = null]): Returns the position of the first token on a line that matches the specified type(s).

findNext and findPrevious are powerful, but have complex signatures.

The first argument $types is one of the PHP token type constants, or an array of them. Typically this is the types of tokens we’re looking for, but it can also be tokens we’re not looking for, if the fourth argument $exclude is true.

The second argument is the token stack pointer where we want to start looking.

The third argument is the token stack pointer where we want to stop looking. If the function does not find a matching token before it reaches this index, it will return false. If this argument is null (the default), the search will continue through the entire file until it finds a match, unless the $local argument is also set (see below).

The fourth argument, $exclude, seems like it would be an array of tokens to exclude from the search, but it is actually a boolean switch. If true, the function will search instead for tokens which do not match the rest of the arguments. This can be useful, for example, to search for something like “the next non-whitespace token”.

If the fifth argument, $value, is set, the search will compare the $token['content'] of each token to $value and ignore tokens for which they are not strictly equal.

The last argument, $local, can be used to stop searching at the next semi-colon (the end of the current statement, which is not necessarily the end of the line). If you need the first matching token on a line rather than within a statement, see findFirstOnLine().

Note that findNext and findPrevious, if they match a token, return the index of the token, not the token itself. It’s still necessary to inspect the token in the stack if we want to learn about it.
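As a sketch of the $exclude switch in practice, here is the “next non-whitespace token” search wrapped in a small function. The wrapper name findNextNonWhitespace is hypothetical; the only phpcs call it relies on is findNext() with the arguments described above.

```php
<?php
use PHP_CodeSniffer\Files\File;

/**
 * Hypothetical helper: return the index of the next token after $stackPtr
 * that is not whitespace, or false if there is none.
 */
function findNextNonWhitespace(File $phpcsFile, int $stackPtr) {
    // Start one past the current token; $exclude = true inverts the match,
    // so this finds the first token that is NOT T_WHITESPACE.
    return $phpcsFile->findNext(T_WHITESPACE, $stackPtr + 1, null, true);
}
```

Inside process() you could then write $nextPtr = findNextNonWhitespace($phpcsFile, $stackPtr); and compare the result against false with === before using it as an index.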

As an example, here are three ways to get the name of a function when examining a T_FUNCTION token in the stack (inside an implementation of process(File $phpcsFile, $stackPtr)).

Using stack math (this might fail if there is more than one space between function and the function name):

$tokens = $phpcsFile->getTokens();
if (! isset($tokens[$stackPtr+2])) {
  return;
}
$functionNameToken = $tokens[$stackPtr+2];
$functionName = $functionNameToken['content'];

Using findNext:

$tokens = $phpcsFile->getTokens();
$nextPtr = $phpcsFile->findNext(T_STRING, $stackPtr, null, false, null, true);
if (! $nextPtr || ! isset($tokens[$nextPtr])) {
  return;
}
$functionNameToken = $tokens[$nextPtr];
$functionName = $functionNameToken['content'];

Using getDeclarationName:

$functionName = $phpcsFile->getDeclarationName($stackPtr);

Example Sniff

Here is an example which looks for a disallowed method name:

<?php

namespace StandardName\Sniffs\MagicMethods;

use PHP_CodeSniffer\Sniffs\Sniff;
use PHP_CodeSniffer\Files\File;

class DisallowMagicSetSniff implements Sniff {
    public function register() {
        return [T_FUNCTION];
    }

    public function process(File $phpcsFile, $stackPtr) {
        $functionName = $phpcsFile->getDeclarationName($stackPtr);
        if ($functionName === '__set') {
            $error = 'Magic setters are not allowed';
            $phpcsFile->addError($error, $stackPtr, 'MagicSet');
        }
    }
}

Often the best thing to do is to look at other sniffs that do similar things (remember that there are several large standards built-in to phpcs which will be installed in your project).

Another very helpful technique is to get a raw dump of all the token types in a file. You could do this manually in the code, but the phpcs command-line tool has a way to do this easily by turning up its verbosity using the -vvvv option as follows.

vendor/bin/phpcs -s -vvvv index.php

This will print out the codes of the token stack as it is processed, although you’ll have to output the token array in your code if you want to look at the tokens themselves.

Writing Tests for Sniffs

phpcs has a built-in mechanism to write PHPUnit tests called PHP_CodeSniffer\Tests\Standards\AbstractSniffUnitTest, but because it assumes that your standards are all installed globally, I’ve been unable to get it to work properly inside a project directory. Instead, I’ll show you how to write unit tests manually, which really isn’t that hard.

You can organize the files for your tests in any way you like, but I suggest that you organize them under a tests directory in the same manner as you have organized your sniff files. So if you have a sniff file at StandardName/Sniffs/MyCategory/ExamineTokensSniff.php then the test for that file would be under tests/StandardName/Sniffs/MyCategory/ExamineTokensSniffTest.php.

Fixtures

While it’s possible to use phpcs to scan just a block of code, it’s much closer to actual usage to scan a whole file in your tests. To that end, we’ll create a fixture file (or several of them) for your tests. Each fixture is just a regular php file that should trigger the errors in your sniff. I name the file “fixture.php” (although you can use whatever name seems appropriate) and put it in the same place as the test. Sometimes it’s helpful to have multiple fixture files for testing a single sniff.

For example, if your sniff prevents using the extract() function, then the fixture for that test might look like the following.

<?php
class MyClass {
    public function doSomething() {
        $all = ['foo' => 'bar' ];
        extract($all);
    }
}

If my sniff works, then this should produce an error on line 5. I also like to include counter-tests in my fixtures to be sure that the sniff does not trigger when it shouldn’t. So in this case I might include other uses of the word extract to be sure they do not show up as errors.

<?php
class MyClass {
    public function doSomething() {
        $all = ['foo' => 'bar' ];
        extract($all);
    }

    public function extract() {
        $extract = 'foo';
        $extract;
    }
}

Linting a File

To run phpcs on our fixture file, we need to create an instance of PHP_CodeSniffer\Files\LocalFile. Its constructor requires three arguments: a path to the fixture file we are going to lint, an instance of PHP_CodeSniffer\Ruleset, and an instance of PHP_CodeSniffer\Config.

The Config we’ll need for testing is pretty simple. I use the following, with no settings, but you can find a list of the available settings in the code itself, which includes the defaults.

$config = new Config();

If you do need to set config settings, you can specify them directly as object properties thanks to the use of PHP’s magic accessor functions, such as $config->cache = false;. In this case you can skip adding the sniffs to the Config because they will be provided directly to the Ruleset.

The Ruleset is the object that will load the sniffs we want to run. To create it, we need to pass its constructor the Config we’ll also be passing to the LocalFile. Then we call the registerSniffs() method and pass in the path of each class file for the sniffs we want to run in the test. Note that even if you want to run only one sniff class, you’ll have to pass an array.

$sniffFiles = ['path/to/MyCustomSniff.php'];
$ruleset = new Ruleset($config);
$ruleset->registerSniffs($sniffFiles, [], []);
$ruleset->populateTokenListeners();

The second and third arguments to registerSniffs() allow filtering sniffs and can be empty arrays for the purposes of testing as we will only pass in the sniffs we want to run.

We must also call populateTokenListeners() to prepare the sniffs to be run.

Finally we can create our LocalFile instance and run our sniffs on the fixture by calling the process() method.

$phpcsFile = new LocalFile($fixtureFile, $ruleset, $config);
$phpcsFile->process();
$foundErrors = $phpcsFile->getErrors();
$foundWarnings = $phpcsFile->getWarnings();

Once it has run, we can access the warnings and errors (collectively called “messages”) by calling getWarnings() and getErrors() on the LocalFile object.

The result of those methods is a multi-layered associative array (be careful because the keys are integers, but they are non-sequential). The first layer’s key is the line number of the message as an integer. The associated value is another array keyed by the column number of the message. The value of that layer is an indexed array of messages which apply at that line and column (there can be more than one).

Each message is an associative array that looks like the following (converted here to JSON):

{
  "message": "Extract is not allowed",
  "source": "StandardName.Extract.DisallowExtract.Extract",
  "listener": "StandardName\\Sniffs\\Extract\\DisallowExtractSniff",
  "severity": 5,
  "fixable": false
}
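Because the nesting (line, then column, then a list of messages) is easy to get wrong, here is a sketch of flattening that structure into one row per message. The function name flattenMessages is hypothetical, and $sample is hand-written data in the shape described above, not real phpcs output (the column number in particular is made up).

```php
<?php
/**
 * Flatten the nested array returned by getErrors() or getWarnings()
 * into a list of rows, one per message.
 */
function flattenMessages(array $messagesByLine): array {
    $rows = [];
    foreach ($messagesByLine as $line => $messagesByColumn) {
        foreach ($messagesByColumn as $column => $messages) {
            foreach ($messages as $message) {
                $rows[] = [
                    'line' => $line,
                    'column' => $column,
                    'source' => $message['source'],
                    'message' => $message['message'],
                ];
            }
        }
    }
    return $rows;
}

// Hand-written sample in the documented shape.
$sample = [
    5 => [
        9 => [
            [
                'message' => 'Extract is not allowed',
                'source' => 'StandardName.Extract.DisallowExtract.Extract',
                'listener' => 'StandardName\Sniffs\Extract\DisallowExtractSniff',
                'severity' => 5,
                'fixable' => false,
            ],
        ],
    ],
];

foreach (flattenMessages($sample) as $row) {
    echo $row['line'] . ':' . $row['column'] . ' ' . $row['source'] . "\n";
}
```

A flat list like this makes it straightforward to assert on the source code of each message as well as its line number, if you want stricter tests.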

Test Assertions

In my tests, I find it’s enough to just verify that the line numbers of each message are correct, since I’m generally only running one sniff at a time. In the following example we verify that the only error in the fixture file is on line 5, where the fixture above calls extract().

$foundErrors = $phpcsFile->getErrors();
$lines = array_keys($foundErrors);
$this->assertEquals([5], $lines);

Here’s a full phpunit test for the extract sniff.

use PHPUnit\Framework\TestCase;
use PHP_CodeSniffer\Files\LocalFile;
use PHP_CodeSniffer\Ruleset;
use PHP_CodeSniffer\Config;

class DisallowExtractSniffTest extends TestCase {
    public function testDisallowExtractSniff() {
        $fixtureFile = __DIR__ . '/fixture.php';
        $sniffFiles = [__DIR__ . '/../../../StandardName/Sniffs/Extract/DisallowExtractSniff.php'];
        $config = new Config();
        $ruleset = new Ruleset($config);
        $ruleset->registerSniffs($sniffFiles, [], []);
        $ruleset->populateTokenListeners();
        $phpcsFile = new LocalFile($fixtureFile, $ruleset, $config);
        $phpcsFile->process();
        $foundErrors = $phpcsFile->getErrors();
        $lines = array_keys($foundErrors);
        // The fixture shown earlier calls extract() on line 5.
        $this->assertEquals([5], $lines);
    }
}

That’s not too complex, but it includes a lot of boilerplate. To reduce it a bit, but still leave the intent clear, I created the following helper class.
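Here is a minimal sketch of such a helper. It uses only the phpcs calls already shown in this post, and its public method names match the ones called by the test that follows, but the method bodies (and the extra getLineNumbersFromMessages and getWarningLineNumbersFromFile methods) are a reconstruction under those assumptions rather than a definitive implementation.

```php
<?php
use PHP_CodeSniffer\Config;
use PHP_CodeSniffer\Files\File;
use PHP_CodeSniffer\Files\LocalFile;
use PHP_CodeSniffer\Ruleset;

class SniffTestHelper {
    /**
     * Build a LocalFile for the fixture with only the given sniff file(s)
     * registered. Accepts a single path or an array of paths.
     */
    public function prepareLocalFileForSniffs($sniffFiles, string $fixtureFile): LocalFile {
        $config = new Config();
        $ruleset = new Ruleset($config);
        // registerSniffs() expects an array, so wrap a single path.
        $ruleset->registerSniffs((array) $sniffFiles, [], []);
        $ruleset->populateTokenListeners();
        return new LocalFile($fixtureFile, $ruleset, $config);
    }

    /** The messages array is keyed by line number; return those keys, sorted. */
    public function getLineNumbersFromMessages(array $messages): array {
        $lines = array_keys($messages);
        sort($lines);
        return $lines;
    }

    public function getErrorLineNumbersFromFile(File $phpcsFile): array {
        return $this->getLineNumbersFromMessages($phpcsFile->getErrors());
    }

    public function getWarningLineNumbersFromFile(File $phpcsFile): array {
        return $this->getLineNumbersFromMessages($phpcsFile->getWarnings());
    }
}
```

Note that the helper does not call process() itself, so the test stays in control of when the sniffs actually run.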

Using that class, the test becomes the following.

<?php
use PHPUnit\Framework\TestCase;

class DisallowExtractSniffTest extends TestCase {
    public function testDisallowExtractSniff() {
        $fixtureFile = __DIR__ . '/fixture.php';
        $sniffFile = __DIR__ . '/../../../StandardName/Sniffs/Extract/DisallowExtractSniff.php';
        $helper = new SniffTestHelper();
        $phpcsFile = $helper->prepareLocalFileForSniffs($sniffFile, $fixtureFile);
        $phpcsFile->process();
        $lines = $helper->getErrorLineNumbersFromFile($phpcsFile);
        // The fixture shown earlier calls extract() on line 5.
        $this->assertEquals([5], $lines);
    }
}

We could also create a trait or test case base class which simplifies the boilerplate even further, but for now I’ll leave that as an exercise for the reader.

Test Running

To include the helper and to load the required phpcs classes, we’ll need a phpunit bootstrap file like the following.
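A minimal sketch of such a bootstrap file is below. It assumes phpcs was installed by composer into the project’s vendor directory and that the helper class lives at tests/SniffTestHelper.php; both paths are assumptions, so adjust them to your project. phpcs ships its own autoload.php at the root of its package, which is what gets loaded here.

```php
<?php
// tests/bootstrap.php (the path used by the composer script in this post)

// Load the phpcs autoloader so classes like Config, Ruleset, and LocalFile resolve.
// Assumes phpcs was installed by composer into ../vendor relative to this file.
require_once __DIR__ . '/../vendor/squizlabs/php_codesniffer/autoload.php';

// Load the test helper. The file name and location are assumptions.
require_once __DIR__ . '/SniffTestHelper.php';
```

Depending on your phpcs version, you may also need to define constants that the command-line runner normally sets for you, such as PHP_CODESNIFFER_VERBOSITY and PHP_CODESNIFFER_CBF, before creating a Config; that is worth checking against the version you have installed.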

We can then make it easy to execute our tests by adding the following lines to our composer.json (yes, you could put the phpunit options in a config file too).

    "scripts": {
        "test": "./vendor/bin/phpunit --bootstrap ./tests/bootstrap.php ./tests"
    }

You’ll also need to install phpunit itself and phpcs.

composer require --dev phpunit/phpunit
composer require --dev squizlabs/php_codesniffer

Then you’ll be able to run your tests with the following simple command.

composer test

Hopefully this guide will help everyone to write sniffs that enforce code patterns and reduce the cognitive load of developers. It’s possible to write sniffs that automatically fix underlying code, but that topic will be a later post.


Author: Payton Swick

Vegan. Digital craftsman. Tea explorer. Avid learner of things. Writes code @automattic.

6 thoughts on “Creating Sniffs for a PHPCS Standard”

  1. All in all you mimic the behavior from Codesniffer’s own `AllSniffs.php` tests. I agree that the tests supplied by Codesniffer have their quirks – not using PHPUnit’s assert methods for one – but it also tests the fixer. Does the simpler call to the test suite justify using a number of Codesniffer internals like `Config` and `LocalFile`?

    1. > All in all you mimic the behavior from Codesniffer’s own `AllSniffs.php` tests.

      My main problem with the AllSniffs.php tests is actually that I wasn’t able to get them to run without the standard being installed in a PHPCS installation itself. That is, IMO, impractical for regular development of standards. It makes it hard to run tests in environments like a CI server where it’s expected that you can just run `composer install && composer test` in a project and get a result (for example, take a look at the hoops that the WPCS standard had to go through to get tests to work). The difference with the approach described here is mostly just taking the magic out of AllSniffs.php and allowing developers to write tests how they like.

      > it also tests the fixer.

      That can be done as well with minimal extra code, although I haven’t written up instructions here yet. And what’s nice is that you can write tests for the fixer behavior separately (since it is often quite a lot more complex). For example, see this set of array tests which test both the error messages as well as the fixer.

      1. > the hoops that the WPCS standard had to go through

        Talked to one of the maintainers of that this weekend. That is how I got the idea of using PHPCS’s functionalities.

        What I did now is copy `AllSniffs` to my project (= coding standard) and replaced the code to fetch the available coding standards to search in the project. A quick gist: https://gist.github.com/timoschinkel/15effe0830bf5edd5bdaf69c5c0235f1

        It would be awesome if PHPCS would move the fetching of standards to a separate method, so I don’t have to copy a large portion of the test.

        Of course you are free to use your own methodologies, I was just curious 🙂

    2. All the above said, I totally agree that it would be nice if there were some convenience methods/classes exposed by PHPCS or by another package to improve the PHPCS testing experience without having to create all these things by hand for each project. It wouldn’t be that hard to turn the above into some boilerplate helpers. I’ve considered publishing something myself, but I’m still trying to determine the most useful way for that to work. And maybe someone more clever will beat me to it! 😁
