Photo of receipt by Michael Walter on Unsplash

Sales Receipt File Format


If Costco keeps a record in their database of every item I’ve ever bought and when, why can’t I keep my own database of that information? For only my own purchases, of course, not yours. Mainly it’s because I don’t have an underage, underpaid offspring to perform all that tedious data entry, although that’s going to be an expensive and unreliable solution to this problem anyway.

When I buy something, I want to receive a digital receipt instead of a paper receipt. Further, I want everything on the receipt delineated in its own field, so that I can extract things like barcode numbers and how much sales tax I paid on each item. That would be cool, no? At least in a statisticsy data-analysis kind of way? Remember all those times I wrote a few SQL queries to pull up fun conclusions you didn’t know you could get out of your database?

I’ve mentioned this concept in passing a couple of times recently, and each time, for some reason, privacy and encryption become immediate concerns that block out all other considerations. No, stop. I’m not saying there should be a central repository for all this data. I’m not saying it should be kept on some third-party website from which you download it. No, no, all I propose is that we replace that little piece of paper with a computer file. It would still be handed from the merchant to me, in person, with nobody else involved. This isn’t meant to replace any existing electronic payment processing systems; it’s just an additional presentation of the current jumble of figures you get on a receipt, but in a consistent, neat, drillable package.

The tricky part, as far as I’m concerned, is establishing the hardware to allow this. Worst case scenario is that I carry a USB thumb drive on my keychain that I can plug into the cashier’s USB hub at time of purchase. A better way might be to dock my iPhone. But in any case I ideally wouldn’t need an "account" or an "identity" or a "public key" or an "email address" or anything of the sort because all that’s happening is I’m receiving a virtual piece of paper, a file being copied to me containing the same data that would have been printed, and nothing more. The privacy is implied by its simplicity.

I’ll admit you might have to be the son of an accountant to have the sort of feelings that make the potential of having all that data at hand interesting. For everyone else, maybe this whole idea serves no discernible benefit. There’s also the question of paying for building something that effectively only serves "consumers", not explicitly a business. It’s the sort of thing a company would only offer as a convenience.

However, it’s also the sort of thing that could be widely adopted or copied very quickly once the right company makes it available to the public. That’s why the file format would need to be documented publicly and very clearly, and hopefully with some flexibility but not too many obnoxious features. You’d want it to be super easy for people to integrate into existing accounting software, etcetera. The less proprietary it is, the more useful and thus inherently valuable it is.

It should be a text format. It should be extremely simple, maybe CSV or INI or JSON (not XML). If there are field labels, they should be whole words, not codes, as human-readable as possible. I should be able to tweak it in Notepad (apologies to everyone who knows that Notepad is the worst text editor). Stuff that’s required versus optional should be very clearly defined, and where there is flexibility, best practices should be presented up front. Each file should describe a single transaction, or there should be a clear way to aggregate transactions into a file. There should be a validator available up front.
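
To make that concrete, here is a rough sketch of a one-transaction receipt and a tiny validator for it. Nothing here is a real standard: the choice of JSON, every field name ("merchant", "unit price", "sales tax" and so on), the sample purchase, and the rule that the total must equal the sum of the line items are all assumptions made up for illustration.

```python
import json
from decimal import Decimal, InvalidOperation

# Hypothetical receipt file. Field names are whole words, as proposed above,
# but they are invented for this sketch and not part of any published format.
SAMPLE_RECEIPT = """
{
  "merchant": "Example Warehouse",
  "date": "2009-06-14T15:42:00",
  "currency": "USD",
  "items": [
    {"description": "Paper towels", "barcode": "012345678905",
     "quantity": 1, "unit price": "15.99", "sales tax": "1.32"},
    {"description": "Rotisserie chicken",
     "quantity": 1, "unit price": "4.99", "sales tax": "0.00"}
  ],
  "total": "22.30"
}
"""

# Assumed split between required and optional fields ("barcode" is optional here).
REQUIRED_RECEIPT_FIELDS = ("merchant", "date", "currency", "items", "total")
REQUIRED_ITEM_FIELDS = ("description", "quantity", "unit price", "sales tax")


def validate_receipt(text):
    """Return a list of problems; an empty list means the receipt passed."""
    try:
        receipt = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(receipt, dict):
        return ["top level must be an object"]

    problems = [f"missing required field: {name}"
                for name in REQUIRED_RECEIPT_FIELDS if name not in receipt]
    if problems:
        return problems
    if not isinstance(receipt["items"], list):
        return ["items must be a list"]

    computed = Decimal("0")
    for number, item in enumerate(receipt["items"], start=1):
        if not isinstance(item, dict):
            problems.append(f"item {number} must be an object")
            continue
        missing = [name for name in REQUIRED_ITEM_FIELDS if name not in item]
        if missing:
            problems.append(f"item {number} is missing: {', '.join(missing)}")
            continue
        try:
            # Money is stored as text and parsed as Decimal to avoid float rounding.
            computed += (Decimal(str(item["unit price"])) * Decimal(str(item["quantity"]))
                         + Decimal(str(item["sales tax"])))
        except InvalidOperation:
            problems.append(f"item {number} has a non-numeric amount")

    try:
        total = Decimal(str(receipt["total"]))
    except InvalidOperation:
        problems.append("total is not numeric")
    else:
        # Only compare totals when every line item was readable.
        if not problems and total != computed:
            problems.append(f"total {total} does not match line items {computed}")
    return problems


def total_sales_tax(receipt):
    """Add up the sales tax paid across all items of a parsed receipt."""
    return sum((Decimal(str(item["sales tax"])) for item in receipt["items"]),
               Decimal("0"))


if __name__ == "__main__":
    print(validate_receipt(SAMPLE_RECEIPT))
    print(total_sales_tax(json.loads(SAMPLE_RECEIPT)))
```

The amounts are kept as quoted strings so that a human editing the file in a plain text editor sees exactly the digits that were printed, and so that a program can read them without floating-point surprises. A real specification would still have to settle the hard parts this skips, such as discounts, refunds, and how several transactions share one file.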

Maybe I’ll draw up some examples of what this file format could look like if we’d only get past this initial resistance to the idea. That means this is your time to resist, if you’re so inclined.
