Copying files… sometimes

File this one under “tools that probably only I will find useful”. In the course of my normal job I need to copy files to a synchronized directory on my computer (something like a DropBox folder). The files are JavaScript code that’s been transpiled and copying them to the synchronized directory is what deploys them to my staging server.

Therefore, as I work, I need to: Code → Transpile → Deploy → Test → Repeat. My ideal work cycle is more like: Code → Test → Repeat. I want to be able to just write code and then reload my browser, not all that boring stuff in the middle.

The process of transpiling my code can be handled by a watcher (usually grunt-watch or watchify) so I don’t have to worry about that part. Unfortunately because of the deploy step my process still looks like: Code → Deploy → Test → Repeat. (Grumble grumble inefficient grumble.)

Now, one way I could handle this is just to move my whole project into the synchronized directory. That way the transpiled files are already in the right place and I don’t need a deploy step. That feels like cheating to me, though. It means that my project directory has to exactly mirror the deploy directory and it means that all of my working files are also being synchronized as I code; that’s a lot of unnecessary data over the wire.

Naturally I decided to automate my way out of this problem. I thought to myself: I could just add another task so that the watcher copies my transpiled code to the synchronized directory when it’s done! Ah, but I’d need to give it an absolute path, and several developers work on this project, each with different synchronized directories. That’s not ideal. And even worse is that another developer might want to handle the deploy process differently.

This led me to write copytotheplace. It’s a very simple library and command-line tool that will allow copying files to a directory by setting the destination as an environment variable or using a config file (or a command-line option or parameter if you’re using the library directly).

If no destination directory is set, this tool will do nothing, which allows it to be used as the last step in a build pipeline without doing anything unless specifically called-for.

To hook it into my particular build tool and watcher, I wrote grunt-copytotheplace which just loads the library into a Grunt task.

Now I just put COPYTOTHEPLACE=/path/to/my/sync/directory in the .env file in my local project directory and Grunt will copy the files there every time they change. More importantly, when other developers who don’t have that option set run their version of Grunt, nothing will be copied anywhere.

I know, it’s a weird solution to a weird problem, but it was a simple way to dramatically speed up my workflow without harming others, and so for now I consider it a win. Maybe next week I’ll come up with a better way. But in the mean time, if you find this tool useful it’s up there in the cloud for all to share. Just npm install copytotheplace and away you go! (See details in the README.)

Partial application and making tea

Partial application is like making tea. The person making the tea needs two pieces of information: what kind of tea, and how many people to serve. We can save time by knowing one of those pieces of information before we begin.

Let’s say we have a tea shop. Whenever a new person comes in with a group, you need to ask them two questions: what kind of tea would they like, and how many cups do they need? After that, you know how to do the rest. Of course, sometimes the customer hems and haws about what kind of tea so it can take a while…

But then the door opens and one of your regulars comes in. You know that she always asks for a green tea, but you’re never sure who she’s coming to meet, so you still have to ask her how many cups she’d like. Because you already know the kind of tea, you only need one piece of information instead of two. You’ve kept the first piece of information in your memory. This is partial application.

When programming, you may be writing a function that takes two (or more!) arguments. If you often use that function in a particular way, you can prepare it by creating a partially applied version that only takes one argument instead of two, keeping the missing argument in memory (often through the clever use of closures). Then when you want to call it, you can call the partially applied version.

Partial application means taking a function and partially applying it to one or more of its arguments, but not all, creating a new function in the process. (Dave Atchley)

A little harder to explain is why you’d want to do that. Well, for me the main reason is to be able to pipe the result of a series of functions together.

That is, when you have a series of functions that need to be run on the same piece of data, you might write something like:

adjustedData = adjustData( data );
preparedData = prepareData( properties, adjustedData );
doSomething( preparedData );

That works fine, but in some cases it can get messy. Also, we’ve created two temporary variables whose entire purpose is to pass the result of one function on to the next. We can do better!

doSomething( prepareData( properties, adjustData( data ) ) );

Ok, that gets rid of the variables, but it’s way less readable. Just think what it would look like if there were ten functions in that chain! What we need is a way to pipe data from one function to the next just like the | character in UNIX shells. There is a type of function which does this, sometimes called “pipe”, “compose”, or “flow”.

composedFunction = flow(
  adjustData,
  prepareData,
  doSomething
);
composedFunction( data );

Wow, that’s so much better! It’s even easy to add or remove steps in the chain. But you may have noticed that I skipped a step: prepareData takes two arguments, so how can we get that second argument in there?

What we need is a way to transform prepareData from a function that accepts two arguments into a function that accepts one argument. This is partial application.

prepareDataWithProperties = partiallyApply( prepareData, properties );
composedFunction = flow(
  adjustData,
  prepareDataWithProperties,
  doSomething
);
composedFunction( data );

Now the partially applied version can be used in our pipe because its first argument is already saved. This is like knowing what kind of tea to make already. We go from needing two pieces of information to only needing one.

There are many ways to actually use partial application in practice. In JavaScript we can use bind and the Lodash flow methods to achieve the above example. But it’s a big topic and there’s a lot of other options available. I’m still learning about them myself.

Avoiding Tightly-Coupled REST APIs

As you might have read, WordPress is getting a powerful REST API. For a few years now I’ve been writing endpoints for the WordPress.com version of this API, as well as few REST APIs in other languages (Ruby, JavaScript). As I’m also a big fan of unit testing, I’d like to share a pattern that I’ve learned from trial and error.

To make REST API endpoints more testable, as well as more useful, write the logic in a separate library.

Let’s say I have an endpoint that updates a menu item on a restaurant web site. In WordPress I have the menu item set up as a custom post type (you don’t need to grok WordPress to benefit from this pattern, but I thought I’d mention it).

The basic operation of the endpoint is this:

  1. Take the post ID and the new field values from the data submitted to the endpoint.
  2. Find the matching post in the database.
  3. Verify user permissions to update that post.
  4. Calculate the menu item’s total price based on current tax rate and ingredients.
  5. Update the post’s data.
  6. Return the newly updated post data to the client.

I’d also like to write unit tests for each of the above operations.

Now, the first step in writing my endpoint would be to map the HTTP route to the route handler. That is, I need to map the route PUT /menu/item to a function which kicks off the process above and eventually returns the result to the client.

I could write all of the above logic right in that function. There’s even a few pieces which could be their own separate functions. But this way may not be ideal, as I’ll explain.

The route handler, wherever it might be defined, is aware that it’s part of a REST API. The exact details of that vary depending on which language, architecture, etc, that you are using. But generally the initial route handler is not just doing the work of the steps above, but also interacting in some way with the API. Perhaps it’s checking details of the path, query string, user token, or it’s sending back an HTTP-specific response.

Here’s where we can get into trouble. Let’s say later on we decide to write another REST API endpoint that accepts a new tax rate and updates all the menu items. It needs to do the following:

  1. Take the new tax rate from the submitted data.
  2. Iterate over each menu item post.
  3. Validate user permissions to update each post.
  4. Calculate each menu item’s new total price.
  5. Update each post.

Notice that steps 3, 4, and 5 are the same as the previous endpoint we made. Rather than re-write them here, let’s keep things DRY and just call the other endpoint; thus our new update-the-tax-rate endpoint looks more like this:

  1. Take the new tax rate from the submitted data.
  2. Iterate over each menu item post.
  3. Call the update-menu-item endpoint.

Ah… but in the WordPress REST API, as well as many other API frameworks in different languages, you can’t call an API endpoint from within another API endpoint. And it might be said that, even if you could, doing so adds unnecessary overhead.

In this case, if the process of finding, validating, and updating a post was all one function call in a library, maybe update_menu_item(), then this endpoint becomes so much easier. Both the update-menu-item endpoint and the update-the-tax-rate endpoint can just call update_menu_item().

Even better, so can our unit tests! While it’s a good idea to write tests for an API endpoint that actually call the API, doing so can sometimes be problematic. Unit tests usually try to test one specific behavior, separating that behavior from the rest of the system. If you test a REST API, the system being tested is on a separate server (even if the server is local to your computer), and is thus much more complicated to control. But if the API’s route handler simply calls a function in a library, we can just test that function, providing mock data and checking the results, all without even considering HTTP.

Certainly this isn’t a rule, as most rules in programming tend not to fit real-life situations. It’s just a pattern I’ve decided works well for me. I suggest you examine your own code to see if it might work for you too!

get_deep_value in PHP

Ever have an array in PHP that looks like this?

$data = [ 'one' => [ 'two' => [ 'four' => 'bar', 'three' => 'foo' ] ] ];

It can be annoying to pull deep values from such an array because if you access an array key that doesn’t exist, PHP will throw an exception.

do_something_with( $data['one']['two']['six'] ); // throws an exception!

You could check each level of the data using array_key_exists() like this:

if ( array_key_exists( 'one', $data )  && array_key_exists( 'two', $data['one'] ) && array_key_exists( 'three', $data['one']['two'] ) ) {
  do_something_with( $data['one']['two']['three'] );
}

But that would take forever. PHP’s isset() method fortunately makes this somewhat more clean:

if ( isset( $data['one']['two']['three'] ) ) {
  do_something_with( $data['one']['two']['three'] ); 
}

But that’s still a lot to type, especially if you have to do it several times. In JavaScript-land we have some nice patterns around this sort of thing. Underscores and Lodash have the _.get() method, so I’ve re-created that in PHP here as get_deep_value().

The first argument is the array you want to search, and the second argument is an array of keys, one per level.

Now you can just write:

if ( $value = Get_Deep_Value::get_deep_value( $data, ['one', 'two', 'three'] ) {
  do_something_with( $value ); 
}

Or maybe even, if your function is ok with null values, just simply:

do_something_with( Get_Deep_Value::get_deep_value( $data, ['one', 'two', 'three'] ) );

I’m just posting the gist here for now because I’m not quite sure how to release this as a general module in the PHP ecosystem yet. If you have any helpful suggestions, please add them in the comments!