Posts

To document some thoughts and learnings.

2018

Watching Upstream Binaries with Concourse on December 2, 2018
When building software packages, it’s easy to accumulate dependencies on dozens of other upstream software components. When building the first version of something, it’s easy to blindly download the source of the latest version from the upstream project’s website. However, once you’re past prototypes and need to deal with auditing or maintenance, it becomes important to have some automated processes in place. I have written several posts over the years about experiments for automatically upgrading components to avoid repetitive work.

Switching from Jekyll to Hugo on November 23, 2018
It has been a while since I spent any time on my personal website here, but recently I have had a few projects and ideas looking for a place to be recorded. As part of revisiting the site, I decided it might be a good opportunity to switch from Jekyll to Hugo as the static site generator. Here are some of the motivations and small learnings from that process.

2017

Documenting Blobs with Metalink Files on October 9, 2017
There are many blobs around the web, with different organizations and teams publishing artifacts through different channels and with varying security. Often a single project will have many dependencies from multiple different sources, and developers need to know specifics about where to download blobs and how to verify them. I started looking for a solution to help unify the way I was both consuming and sharing blobs across my own projects.

2016

Self-Upgrading Packages in BOSH Releases, Part 2 on October 21, 2016
Last year I wrote a post about how the process of updating BOSH release blobs could be better automated. The post relied on some scripts which could be executed to check and download new versions of blobs. The scripts were useful, but they still required manual execution and then testing to verify compatibility. My latest evolution of the idea further automates this with Concourse to check for new versions, download new blobs, build test releases, and then send pull requests for successful upgrades.

Data Processing with Concourse on October 19, 2016
Recently I needed to focus on a project that regularly processed datasets with typical extract, transform, and load stages. Historically it was using Amazon SQS to queue up the tasks for each stage, some supervisor-managed processes to work off the queue, and Amazon S3 to store the results of each stage. The original implementation was struggling: it was inefficient, problems were difficult to detect, and replaying a stage whenever unexpected, bad data showed up was even more difficult.

Composing Configurations with JQ on April 26, 2016
When managing configurations for services there are often variables which need to be changed depending on the environment or use case. Different tools deal with that sort of parameterization slightly differently. For example, AWS CloudFormation stack templates have a high-level Parameter type which can contain user-supplied values, along with built-in functions to concatenate and do some other primitive transformations, while BOSH manifests are actually ERB templates, allowing for dynamic inclusion of environment variables, file contents, settings from configuration files, or complicated logic.

Writing a PHP Client for the Ravelry API on January 21, 2016
Ravelry is, in their own words, “a place for knitters, crocheters, designers, spinners, weavers and dyers to keep track of their yarn, tools, project and pattern information, and look to others for ideas and inspiration.” It’s no wonder so many of TLE’s customers are also “Ravelers”. Several years ago Ravelry created an API so developers could write apps and create integrations that users would love. Classifying myself as more a developer than a knitter, the API piqued my interest.

Experimenting with BOSH Links and Consul on January 11, 2016
With BOSH, I use deployments to segment various services. For example, TLE has several services like web and database servers, WordPress blogs, the main e-commerce application, statistics, and internal services. Many of them are interconnected in some way. Historically I’ve used a combination of hard-coded IP addresses in the deployment properties and dynamic service discovery with consul. With a small bit of tweaking and an extra pre-parser, I’m now able to emulate much of the proposed links features, but from a more dynamic, distributed perspective.

2015

Tempore limites: BOSH Veneer on November 12, 2015
For all the low-level handling of things, BOSH is a good tool for system administration. But when it comes to configuring everything, I think it leaves something to be desired for the average Joe. Opening my text editor, making changes to the YAML, copying and pasting security groups from AWS Console, git diffing to make sure I did what I think I did, git committing in case things go bad, bosh deploying to make it so… it can become quite the process.

Pruning Blobs from BOSH Releases on August 6, 2015
Over time, as blobs are continually added to BOSH releases, the files can start consuming lots of disk space. Blobs are frequently abandoned because newer versions replace them, or sometimes the original packages referencing them are removed. Unfortunately, freeing the disk space isn’t as simple as rm blobs/elasticsearch-1.5.2.tar.gz because BOSH keeps track of blobs in the config/blobs.yml file and uses symlinks to cached copies. To help keep a lean workspace, I remove references to blobs which are no longer needed in my release.

Self-Upgrading Packages in BOSH Releases on August 3, 2015
Outside of BOSH world, package management is often handled by tools like yum and apt. With those tools, you’re able to run trivial commands like yum info apache2 to check the available versions or yum update apache2 to upgrade to the latest version. It’s even possible to automatically apply updates via cron job. With BOSH, it’s not nearly so easy since you must monitor upstream releases and manually download the sources before moving on to testing and deploying.

Using nginx to Reverse Proxy and Cache S3 Objects on June 20, 2015
My most recent project for TLE has been focused on making the infrastructure much more “cloud-friendly” and resilient to failures. One step in the project was going to require that more than one version of the application might be running at a given time (typically just while a new version is still being rolled out to servers). The application itself doesn’t have an issue with that sort of transition period; however, the way we were handling static assets (like stylesheets, scripts, and images) was going to cause problems.

New BOSH Release for OpenVPN on June 3, 2015
I’m a big fan of OpenVPN - both for personal and professional VPNs. Seeing as how I’ve been deploying more things with BOSH lately, an OpenVPN release seemed like a good little project. I started one about nine months ago and have been using development releases ever since, but last week I went ahead and created a “final” release of it. There is only a single job (openvpn) and the properties are well documented.

Parsing Microdata in PHP on May 1, 2015
A couple years ago I wrote about how I was adding microdata to The Loopy Ewe website to annotate things like products, brands, and contact details. I later wrote about how the internal search engine depended on that microdata for search results. During development and the initial release I was using some basic XPath queries, but as time passed the implementation became more fragile and incomplete. Since then, the parser has gone through several refactorings and this week I was able to extract it into a separate library that I can open source.
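
For a rough idea of the kind of basic XPath queries the post refers to, a naive microdata pass might look like the sketch below. The URL, the schema.org type, and the expressions are illustrative only, and nested itemscope elements are not handled - one reason an approach like this grows fragile and incomplete over time.

    <?php
    // Illustrative sketch: collect itemprop values for Product scopes.
    $document = new DOMDocument();
    @$document->loadHTML(file_get_contents('http://example.com/product/123'));

    $xpath = new DOMXPath($document);

    foreach ($xpath->query('//*[@itemscope][@itemtype="http://schema.org/Product"]') as $scope) {
        $item = [];

        // grab every itemprop under the scope (naively, ignoring nested scopes)
        foreach ($xpath->query('.//*[@itemprop]', $scope) as $property) {
            $item[$property->getAttribute('itemprop')] = trim($property->textContent);
        }

        var_dump($item);
    }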

Sending Work from a Web Application to Desktop Applications on February 21, 2015
I prefer working on the web application side of things, but there are frequently tasks that need to be automated outside the context of a browser and server. For TLE, there’s a physical shop where inventory, order, and shipping tasks need to happen, and those tasks revolve around web-based systems of one form or another. To help unify and simplify things for the staff (aka elves), I’ve been connecting scripts on the workstations with internal web applications via queues in the cloud.

2014

Logging logging and Finding Bottlenecks on November 14, 2014
I’ve been doing quite a bit of work with the ELK stack (elasticsearch, logstash, kibana) through the logsearch project. As we continued to scale the stack to handle more logs and log types, we started having difficulty identifying where some of the bottlenecks were occurring. Our most noticeable issue was that occasionally the load on our parsers would spike for sustained periods, causing our queue to get backed up and real-time processing to get significantly delayed.

Colorado Aspens on September 28, 2014
Colorado is usually a beautiful place, but especially in Autumn when the Aspens are turning…

Simplifying My BOSH-related Workflows on September 17, 2014
Over the last nine months I’ve been getting into BOSH quite a bit. Historically, I’ve been reluctant to invest in BOSH because I don’t entirely agree with its architecture and because of its steep learning curve. BOSH describes itself with… “BOSH installs and updates software packages on large numbers of VMs over many IaaS providers with the absolute minimum of configuration changes. BOSH orchestrates initial deployments and ongoing updates that are:”

Search by Color with Elasticsearch on April 24, 2014
A year ago when I updated the TLE website I dropped the “search by color” functionality. Originally, all the colors were indexed into a database table and the frontend generated some complex queries to support specific and multi-color searching. On occasion, it caused some database bottlenecks during peak loads and with some particularly complex color combinations. The color search was also a completely separate interface from searching other product attributes and availability.

Photo Galleries for Jekyll on April 8, 2014
I had a trip to London and Iceland several weeks ago, and I wanted to share some of those photos with people. In the past I’ve put those sorts of photo galleries on Facebook, but some friends don’t have accounts there and I figured I could/should just keep my photos with my other personal stuff here. Unlike WordPress, Jekyll doesn’t really have a concept of photo galleries, and since Jekyll is a static site generator it makes things a little more difficult.

Distributed Docker Containers on February 28, 2014
One thing I’ve been working with lately is Docker. You’ve probably seen it referenced in various tech articles as the next greatest thing for cloud computing. Docker runs “containers” from base “images” which essentially allow running many lightweight virtual machines on any recent, Linux-based system. Internally, the magic behind it is lxc, although Docker adds a lot more on top to make it more usable. For a long time now I’ve used virtual machines for development - they let me better simulate how software runs out on production servers.

Barcoding Inventory with QR Codes on January 13, 2014
Most decently-sized stores will have barcodes on their products. For the store, barcodes make the checkout process extremely easy and accurate. For the consumer, they can be handy to scan with a phone app. I needed to make the inventory scannable at the shop, and I really wanted to do it in a more meaningful way than 1D barcodes could support. Barcodes come in two different kinds: 1-dimensional and 2-dimensional.

2013

The Basics of a Custom Search Engine on June 1, 2013
One of the most useful features of a website is the ability to search. The Loopy Ewe has had some form of faceted product search for a long time, but it has never had the ability to quickly find regular pages, categories, brands, blog posts and the like. Google seems to lead in offering custom search products with both Custom Search Engine and Site Search, but they’re either branded or cost a bit of money.

ti-debug: For Debugging Server Code in the Browser on May 16, 2013
I find that I am rarely using full IDEs to write code (e.g. Eclipse, Komodo, NetBeans, Zend Studio). They tend to be a bit sluggish when working with larger projects, so I favor simplistic editors like Coda or the always-faithful vim. One thing I miss about using full-featured IDEs is their debugging capabilities. They usually have convenient debugger interfaces that allow stepping through runtime code to investigate bugs. About a year ago I started a project called ti-debug with the goal of being able to debug my server-side code (like PHP) through WebKit’s developer tools interface.

Structured Data with schema.org on May 13, 2013
Good website content is important so people can learn and interact, but robots are the ones interpreting that content to figure out whether it’s actually useful to people. With the new website I wanted to be sure I was using standards and metadata so the content could be programmatically useful. I chose to use the markup from schema.org due to its fairly comprehensive data types and broad adoption by search engines.

Embeddable and Context-Aware Web Pages on May 7, 2013
In my symfony website applications I frequently make multiple subrequests to reuse content from other controllers. For simple, non-dynamic content this is trivial, but when arguments can change data or when the browser may want to update those subrequests, things start to get complicated. Usually it requires tying the logic of the subrequest controller into the main request controller (e.g. knowing that the q argument needs to be passed to the template, and then making sure the template passes it along in the subrequest).
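
As a minimal illustration of that coupling (not the approach the post ends up proposing), here is a Symfony2-style sketch where the page controller has to know the embedded search box cares about q; the bundle, controller, and template names are hypothetical.

    <?php
    // Hypothetical Symfony2-style controller: the main action must know the
    // embedded search box depends on the "q" argument and thread it through.
    use Symfony\Bundle\FrameworkBundle\Controller\Controller;
    use Symfony\Component\HttpFoundation\Request;

    class PageController extends Controller
    {
        public function showAction(Request $request)
        {
            $searchBox = $this->forward('AcmeSiteBundle:Search:box', array(
                'q' => $request->query->get('q'),
            ));

            return $this->render('AcmeSiteBundle:Page:show.html.twig', array(
                'searchBox' => $searchBox->getContent(),
            ));
        }
    }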

New Website for The Loopy Ewe on April 27, 2013
I’ve spent the past several months working on some website changes for The Loopy Ewe. On Thursday I was able to push many of those frontend changes out. I thought I’d briefly discuss some of those changes here. First off, it’s fun to show before and after screenshots of many key areas, starting with the home page, which is one of the first pages welcoming new visitors.

Bank Card Readers for Web Applications on March 23, 2013
I made a web-based point of sale for The Loopy Ewe, but it needed an easier way to accept credit cards than manually typing in the card details. To help with that, we got a keyboard-emulating USB magnetic card reader and I wrote a parser to read the card data and convert it to an object. It is fairly simple to hook up to a form and enable a card to be scanned while the user is focused in the name or number fields…
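
As a rough sketch of what such a parser can look like - the field layout follows the common Track 1 card format, and the class and function names below are made up for illustration rather than taken from the actual implementation:

    <?php
    // Hypothetical example: parse a swiped Track 1 string into a simple object.
    class SwipedCard
    {
        public $number;
        public $lastName;
        public $firstName;
        public $expYear;
        public $expMonth;
    }

    function parseTrackOne($raw)
    {
        // %B<PAN>^<LAST/FIRST>^<YYMM>... per the common Track 1 layout
        if (!preg_match('/%B(\d{1,19})\^([^^]{2,26})\^(\d{2})(\d{2})/', $raw, $match)) {
            return null;
        }

        list($last, $first) = array_pad(explode('/', trim($match[2]), 2), 2, '');

        $card = new SwipedCard();
        $card->number = $match[1];
        $card->lastName = trim($last);
        $card->firstName = trim($first);
        $card->expYear = '20' . $match[3];
        $card->expMonth = $match[4];

        return $card;
    }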

Using HTML Headers with wkhtmltopdf on March 15, 2013
Preparing for my job search, I really wanted to somehow reuse the content from my about page for my résumé instead of trying to also maintain the information in a Word/Google Drive file. Mac OS X has the convenient capability of printing anything to a PDF, which, paired with a general print-specific stylesheet for browsers, gets most of the way there - but it still had a few drawbacks. One of those drawbacks is headers - I expect to see them on even the simplest professional documents.

Comparing PHP Application Definitions on March 7, 2013
While working to update a PHP project, I thought it’d be helpful if I could systematically qualify significant code changes between versions. I could weed through a massive line diff, but that’s costly if many of the changes don’t ultimately affect my API dependencies. Typically I only care about how interfaces and classes change in their usage of methods, method arguments, variables, and scope. I did a bit of research and found several related questions, a few referenced products, and a short article on the idea.
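
As a sketch of the general idea (not the tooling the post goes on to discuss), PHP’s Reflection API can dump a class’s public method signatures into a normalized text form that is easy to diff between two checkouts; the class name below is a placeholder.

    <?php
    // Dump a sorted list of public method signatures for diffing.
    function describeClass($className)
    {
        $class = new ReflectionClass($className);
        $lines = [];

        foreach ($class->getMethods(ReflectionMethod::IS_PUBLIC) as $method) {
            $params = [];

            foreach ($method->getParameters() as $param) {
                $params[] = ($param->isOptional() ? '?' : '') . '$' . $param->getName();
            }

            $lines[] = sprintf('%s::%s(%s)', $className, $method->getName(), implode(', ', $params));
        }

        sort($lines);

        return implode("\n", $lines);
    }

    // e.g. diff the output of describeClass('Some\\Service') between two versions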

Path-based tmpfile in PHP on March 5, 2013
PHP has the tmpfile function for creating a file handle which will automatically be destroyed when it is closed or when the script ends. PHP also has the tempnam function which takes care of creating the file and returning the path, but doesn’t automatically destroy the file. To get the best of both worlds (temp file + auto-destroy), I have found this useful:

    <?php
    function tmpfilepath()
    {
        // keep the handle referenced so the temporary file isn't removed as
        // soon as the resource returned by tmpfile() goes out of scope
        static $handles = [];

        $handles[] = $handle = tmpfile();
        $path = stream_get_meta_data($handle)['uri'];

        register_shutdown_function(function () use ($path) {
            unlink($path);
        });

        return $path;
    }

A Generic Storage Interface on March 1, 2013
Websites often have a lot of different assets and files for the various areas of a website - content management systems, photo galleries, e-commerce product photos, etc. As a site grows, so does storage demand and backup requirements, and as storage demands grow it typically becomes necessary to distribute those files across multiple servers or services. One method for managing disparate file systems is to use custom PHP stream wrappers and configurable paths; but some extensions don’t yet support custom wrappers for file access.
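
For reference, registering a custom stream wrapper looks roughly like the sketch below; the assets:// scheme, class name, and base path are hypothetical, and only the read-side methods needed by functions like file_get_contents are implemented.

    <?php
    // Minimal read-only stream wrapper mapping assets:// onto a local directory.
    class ReadOnlyAssetStream
    {
        public $context;
        private $handle;

        public function stream_open($path, $mode, $options, &$opened_path)
        {
            // "assets://products/1.jpg" -> "/var/data/assets/products/1.jpg"
            $relative = parse_url($path, PHP_URL_HOST) . parse_url($path, PHP_URL_PATH);
            $this->handle = fopen('/var/data/assets/' . $relative, $mode);

            return false !== $this->handle;
        }

        public function stream_read($count) { return fread($this->handle, $count); }
        public function stream_eof() { return feof($this->handle); }
        public function stream_stat() { return fstat($this->handle); }
        public function stream_close() { fclose($this->handle); }
    }

    stream_wrapper_register('assets', 'ReadOnlyAssetStream');

    // ordinary filesystem functions now work through the wrapper
    echo file_get_contents('assets://products/1.jpg');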

Using Facter in Ant Scripts on February 19, 2013
After using puppet for a while I have become used to some of the facts that facter automatically provides. When working with ant build scripts, I started wishing I didn’t have to generate similar facts myself through various exec calls. For a single fact, instead of a fragile lookup like…

    <exec executable="/bin/bash" outputproperty="lookup.eth0">
      <arg value="-c" />
      <arg value="/sbin/ifconfig eth0 | grep 'inet addr' | awk -F: '{print $2}' | awk '{print $1}'" />
    </exec>

I can simplify it with…

Automating Backups to the Cloud on February 8, 2013
Backups are extremely important and I’ve been experimenting with a few different methods. My concerns are always focused on maintaining data integrity, security, and availability. One of my current methods involves using asymmetric keys for secure storage and object versioning to ensure backup data can’t be overwritten unintentionally. For encryption and decryption I’m using asymmetric keys via gpg. This way, any server can generate and encrypt the data, but only administrators who have the private key can actually decrypt it.

Scripting Endicia to Purchase Postage on January 28, 2013
We currently use Endicia for Mac for postage processing at Loopy. We rarely use the UI since I’ve scripted most of it, but one annoyance has been having to regularly open it up and add postage since it doesn’t reload automatically. If we happen to forget, it ends up blocking things until we notice. I finally got around to scripting that, too. In real life, whenever the balance gets too low it throws up an alert and you need to click through a few menus, select a purchase amount, and confirm the selection before the application will continue.

OpenGrok CLI on January 21, 2013
One tool that makes my life as a software developer easier is OpenGrok - it lets me quickly find application code and it knows more context than a simple grep. It has a built-in web interface, but sometimes I want to work with search results from the command line (particularly for automated tasks). Since I couldn’t find an API, I created a command to load and parse results using symfony/console and xpath.
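
A stripped-down sketch of that combination might look like the following; the OpenGrok URL, query parameter, and XPath expression here are guesses for illustration rather than a documented interface.

    <?php
    // Hypothetical symfony/console command: fetch an OpenGrok search page and
    // print the linked result paths found via XPath.
    use Symfony\Component\Console\Command\Command;
    use Symfony\Component\Console\Input\InputArgument;
    use Symfony\Component\Console\Input\InputInterface;
    use Symfony\Component\Console\Output\OutputInterface;

    class GrokSearchCommand extends Command
    {
        protected function configure()
        {
            $this->setName('grok:search')->addArgument('query', InputArgument::REQUIRED);
        }

        protected function execute(InputInterface $input, OutputInterface $output)
        {
            $url = 'http://opengrok.example.com/source/search?q=' . urlencode($input->getArgument('query'));

            $document = new DOMDocument();
            @$document->loadHTML(file_get_contents($url));

            $xpath = new DOMXPath($document);

            // the selector is a guess at the results markup
            foreach ($xpath->query('//div[@id="results"]//a/@href') as $href) {
                $output->writeln($href->value);
            }

            return 0;
        }
    }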

Terminating Gearman Workers in PHP on January 14, 2013
I use Gearman as a queue/job server. An application gives it a job to do, and Gearman passes the job along to a worker that can finish it. Handling both synchronous and asynchronous tasks, the workers can be running anywhere – the same server as Gearman, a server across the country, or even a workstation at a local office. This makes things a bit complicated when it comes time to push out software or configuration changes to workers.
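
For illustration, one common pattern (not necessarily the approach the post settles on) is a worker loop built on ext-gearman and ext-pcntl that wakes up periodically and exits cleanly on SIGTERM, so new code or configuration can be rolled out; the function name and server address are placeholders.

    <?php
    // Sketch of a gracefully-terminating Gearman worker loop.
    $shutdown = false;
    pcntl_signal(SIGTERM, function () use (&$shutdown) {
        $shutdown = true;
    });

    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730);
    $worker->setTimeout(5000); // wake up regularly to check for shutdown

    $worker->addFunction('resize_image', function (GearmanJob $job) {
        // ... do the actual work against $job->workload() ...
        return $job->workload();
    });

    while (!$shutdown) {
        if (!$worker->work() && GEARMAN_TIMEOUT !== $worker->returnCode()) {
            // some other error (e.g. lost connection); back off briefly
            usleep(250000);
        }

        pcntl_signal_dispatch();
    }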

Secure Git Repositories on January 7, 2013
I use private repositories on GitHub, but I still don’t feel quite comfortable pushing sensitive data like passwords, keys, and account information. Typically that information ends up just sitting on my local machine or in my head ready for me to pull up as needed. It would be much better if that information was a bit more fault tolerant and, even better, if I could follow similar workflows as the rest of my application code.