
Weapons of Mass Distraction

Fighting Distractions With Code

I fight a daily battle against distractions. I know I’m not the only one, either, because the web sites that distract me are full of articles about how I can be less distracted. (I don’t want to change too much about how my mind works—the same part of my brain that gets distracted also is amazingly creative in problem solving.)

I’ve used some tools that help some of the time: pomodoro timers, (10 + 2) * 5 timers, time trackers, a custom hosts file, closing the browser, and upbeat music are just the beginning. In this article, I want to talk about how I handle the tool of productivity and distraction called the web.

/etc/hosts

The cheapest and easiest way to selectively block websites is to prevent your computer from resolving their names. The hosts file is consulted before any DNS server, so mapping a site’s name to localhost there means my browser can’t reach it. I did not come up with the idea, but it is a great one. As you can see from this segment of my /etc/hosts file, I have blocked several news sites.

[Screenshot: a segment of my /etc/hosts file]
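For reference, entries of this kind are just an address followed by a host name. Here is a minimal sketch in Python 3 that prints such lines; the domain names and the hosts_entries helper are placeholders of my own, not the sites or tooling from my actual file.

```python
# Hypothetical domains; substitute the sites that distract you.
BLOCKED_DOMAINS = ["news.example.com", "www.news.example.com"]


def hosts_entries(domains, address="127.0.0.1"):
    """Return one hosts-file line per domain, mapping each name to address."""
    return ["{0}\t{1}".format(address, domain) for domain in domains]


# Print the lines so they can be reviewed before being pasted into /etc/hosts.
print("\n".join(hosts_entries(BLOCKED_DOMAINS)))
```

Each name needs its own entry, because the hosts file matches exact names and does not understand wildcards.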

When I want or need to view a blocked site, I have the distinct annoyance of running $ sudo vim /etc/hosts to comment out its line. This deters me from cheating, but it is also a pain when a work-related search points to a blocked site.
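The commenting and uncommenting could be scripted. The following is only a sketch in Python 3 under my own assumptions: set_blocked is a hypothetical helper that works on the text of a hosts file, and it treats a line as an entry only when its first field parses as an IP address.

```python
import ipaddress


def _is_address(token):
    """Return True if token parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(token)
    except ValueError:
        return False
    return True


def set_blocked(hosts_text, domain, blocked):
    """Restore (blocked=True) or comment out (blocked=False) entries for domain."""
    result = []
    for line in hosts_text.splitlines():
        # Strip any leading '#' to see what the entry would be if active.
        entry = line.lstrip("#").strip()
        fields = entry.split()
        if len(fields) >= 2 and _is_address(fields[0]) and domain in fields[1:]:
            line = entry if blocked else "# " + entry
        result.append(line)
    return "\n".join(result) + "\n"
```

Writing the result back to /etc/hosts still requires root, so this shortens the edit without removing the deterrent.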

An Idea

Recently I was pondering how inefficiently I track my time. I know that I could use any one of the many time-tracking tools available and that lots of them are really good. It occurred to me that it would be handy to have a record of my web browsing. (I could just file a FOIA request with the NSA, but that would be slow.)

Then I realized that I could have a record of my browsing. I just needed a local proxy server that would record everything. (I normally use about four different browsers at a time, so just reading the browser cache is not a useful solution.)

Local HTTP Proxy

Implementing an HTTP proxy is not a difficult task. Implementing a robust HTTP proxy that performs well is not trivial. Thankfully, Twisted has already done the hard parts for me. Here is a simple HTTP proxy server that listens on port 8080:

import twisted.internet.reactor
import twisted.web.http
import twisted.web.proxy


def start_proxy():
    proxyFactory = twisted.web.http.HTTPFactory()
    proxyFactory.protocol = twisted.web.proxy.Proxy
    twisted.internet.reactor.listenTCP(8080, proxyFactory)


def main():
    start_proxy()
    twisted.internet.reactor.run()


if '__main__' == __name__:
    main()
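To try the proxy without reconfiguring a browser, a client can be pointed at it from Python. This is a sketch in Python 3 using the standard library's urllib; make_proxied_opener is a name I made up, and I am assuming the proxy above is running on localhost port 8080 and that only plain http:// URLs are routed through it.

```python
import urllib.request


def make_proxied_opener(host="localhost", port=8080):
    """Build a urllib opener that sends plain-HTTP requests through the proxy."""
    proxy_url = "http://{0}:{1}".format(host, port)
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


# Example use (needs the proxy running and a network connection):
#     opener = make_proxied_opener()
#     with opener.open("http://example.com/") as response:
#         print(response.status)
```

Browsers can be configured the same way by setting their HTTP proxy to localhost and port 8080.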

To make it more useful, I want it to log the URLs I visit, along with timestamps. After digging in Twisted’s twisted.web.proxy module, I’ve determined that the twisted.web.proxy.ProxyRequest class is the best location for logging the information I want to capture. Below I’ve replaced it with CustomProxyRequest, which adds logging in the process method:

import logging
import sys

import twisted.internet.reactor
import twisted.web.http
import twisted.web.proxy


class CustomProxyRequest(twisted.web.proxy.ProxyRequest):
    def __init__(self, channel, queued, reactor=twisted.internet.reactor):
        # twisted.web.http.Request--the ultimate base class--is an old-style
        # class, so we can't use super() here.
        twisted.web.proxy.ProxyRequest.__init__(self, channel, queued, reactor)

    def process(self):
        log = logging.getLogger('CustomProxyRequest')
        m = '{method}\t{uri}'.format(method=self.method, uri=self.uri)
        log.info(m)
        twisted.web.proxy.ProxyRequest.process(self)


def configure_logging():
    formatter = logging.Formatter('%(created)f\t%(message)s')
    stdhandler = logging.StreamHandler(sys.stdout)
    stdhandler.setFormatter(formatter)
    log = logging.getLogger('CustomProxyRequest')
    log.addHandler(stdhandler)
    log.setLevel(logging.INFO)


def start_proxy():
    proxyFactory = twisted.web.http.HTTPFactory()
    proxyFactory.protocol = twisted.web.proxy.Proxy
    proxyFactory.protocol.requestFactory = CustomProxyRequest
    twisted.internet.reactor.listenTCP(8080, proxyFactory)


def main():
    configure_logging()
    start_proxy()
    twisted.internet.reactor.run()


if '__main__' == __name__:
    main()

At this point, the CustomProxyRequest is logging the request method and URI to standard output. (To log the response, more work is required.) This is a good first step in helping me understand my browsing habits.
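Once the output is redirected to a file, a few lines of Python can summarize it. This sketch (Python 3) assumes each record is a timestamp, method, and URI separated by tabs, with the method and URI logged as plain text; count_requests_by_host is a hypothetical helper, not part of the proxy.

```python
import collections
import urllib.parse


def count_requests_by_host(log_lines):
    """Count logged requests per host from 'timestamp<TAB>method<TAB>uri' lines."""
    counts = collections.Counter()
    for line in log_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # skip anything that is not a request record
        _timestamp, _method, uri = fields
        host = urllib.parse.urlsplit(uri).hostname
        if host:
            counts[host] += 1
    return counts
```

Calling counts.most_common(10) on the result lists the ten hosts requested most often.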

In the future, I hope to modify the proxy so it actively helps me focus, possibly by injecting pages that remind me that I should be working instead of browsing. I’m not sure how best to accomplish this, whether I need to study machine learning or if a set of rules will do the trick. Suggestions are welcome.
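As a starting point for the rules-based option, the decision itself could be a small pure function. Everything in this Python 3 sketch is an assumption for illustration: the host list, the working hours, and the should_remind name are placeholders, and it is not yet wired into the proxy.

```python
import datetime
import urllib.parse

# Hypothetical rule set: hosts to discourage and the hours that count as work.
DISTRACTING_HOSTS = {"news.example.com"}
WORK_START = datetime.time(9, 0)
WORK_END = datetime.time(17, 0)


def should_remind(uri, now):
    """Return True if uri points at a distracting host during work hours."""
    host = urllib.parse.urlsplit(uri).hostname or ""
    # Match the listed host itself and any of its subdomains.
    is_distracting = any(
        host == blocked or host.endswith("." + blocked)
        for blocked in DISTRACTING_HOSTS
    )
    # Monday is 0, so weekday() < 5 means Monday through Friday.
    in_work_hours = now.weekday() < 5 and WORK_START <= now.time() < WORK_END
    return is_distracting and in_work_hours
```

A process method could call something like this with the request URI and the current time, then serve a reminder page instead of forwarding the request.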
