Fighting Distractions With Code

/etc/hosts

The cheapest and easiest way to selectively block websites is to prevent your computer from resolving their names. By making the sites resolve to localhost before hitting a DNS server, my browser can’t show them. I did not come up with the idea, but it is a great one. As you can see from this segment of my /etc/hosts file, I have blocked several news sites.

When I want or need to view a blocked site, I have the distinct annoyance of needing to run sudo vim /etc/hosts and comment out the line. This deters me from cheating, but it is also a pain when a work-related search points to a blocked site.
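A small script could take some of the pain out of toggling an entry. Here is a minimal sketch, assuming blocked sites are listed one per line against 127.0.0.1 or ::1. The function name set_blocked and the hostname news.example.com are made up for illustration. It works on the file's text only; reading and writing /etc/hosts itself still requires root.

```python
# Addresses this sketch treats as "blocking" entries (an assumption).
BLOCK_ADDRESSES = {"127.0.0.1", "::1"}


def set_blocked(hosts_text, hostname, blocked):
    """Return hosts_text with blocking entries for hostname enabled or commented out."""
    out = []
    for line in hosts_text.splitlines():
        # Drop any leading comment marker so commented-out entries can be matched too.
        body = line.lstrip().lstrip("#").strip()
        fields = body.split()
        # A hosts entry is an address followed by one or more names.
        names_host = len(fields) >= 2 and hostname in fields[1:]
        if names_host and fields[0] in BLOCK_ADDRESSES:
            line = body if blocked else "# " + body
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n127.0.0.1 news.example.com\n"
    print(set_blocked(sample, "news.example.com", blocked=False))
```

Because the localhost line does not name the requested host, it is left alone. Only lines that pair a blocking address with the given hostname are touched.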

An Idea

Recently I was pondering how inefficiently I track my time. I know that I could use any one of the many time-tracking tools available and that lots of them are really good. It occurred to me that it would be handy to have a record of my web browsing. (I could just file a FOIA request with the NSA, but that would be slow.)

Then I realized that I could have a record of my browsing. I just needed a local proxy server that would record everything. (I normally use about four different browsers at a time, so just reading the browser cache is not a useful solution.)

Local HTTP Proxy

Implementing an HTTP proxy is not a difficult task. Implementing a robust HTTP proxy that performs well is not trivial. Thankfully, Twisted has already done the hard parts for me. Here is a simple HTTP proxy server that listens on port 8080:

    import twisted.internet.reactor
    import twisted.web.http
    import twisted.web.proxy


    def start_proxy():
        proxyFactory = twisted.web.http.HTTPFactory()
        proxyFactory.protocol = twisted.web.proxy.Proxy
        twisted.internet.reactor.listenTCP(8080, proxyFactory)


    def main():
        start_proxy()
        twisted.internet.reactor.run()


    if '__main__' == __name__:
        main()

To make it more useful, I want it to log the URLs I visit, along with timestamps. After digging in Twisted’s twisted.web.proxy module, I’ve determined that the twisted.web.proxy.ProxyRequest class is the best location for logging the information I want to capture. Below I’ve replaced it with CustomProxyRequest which adds logging in the process method:

    import logging
    import sys

    import twisted.internet.reactor
    import twisted.web.http
    import twisted.web.proxy


    class CustomProxyRequest(twisted.web.proxy.ProxyRequest):
        def __init__(self, channel, queued, reactor=twisted.internet.reactor):
            # twisted.web.http.Request--the ultimate base class--is an old-style
            # class, so we can't use super() here.
            twisted.web.proxy.ProxyRequest.__init__(self, channel, queued, reactor)

        def process(self):
            # Log the method and URI, tab-separated, then let the stock
            # ProxyRequest forward the request as usual.
            log = logging.getLogger('CustomProxyRequest')
            m = '{method}\t{uri}'.format(method=self.method, uri=self.uri)
            log.info(m)
            twisted.web.proxy.ProxyRequest.process(self)


    def configure_logging():
        # Each record is written as: timestamp <TAB> method <TAB> URI
        formatter = logging.Formatter('%(created)f\t%(message)s')
        stdhandler = logging.StreamHandler(sys.stdout)
        stdhandler.setFormatter(formatter)
        log = logging.getLogger('CustomProxyRequest')
        log.addHandler(stdhandler)
        log.setLevel(logging.INFO)


    def start_proxy():
        proxyFactory = twisted.web.http.HTTPFactory()
        proxyFactory.protocol = twisted.web.proxy.Proxy
        proxyFactory.protocol.requestFactory = CustomProxyRequest
        twisted.internet.reactor.listenTCP(8080, proxyFactory)


    def main():
        configure_logging()
        start_proxy()
        twisted.internet.reactor.run()


    if '__main__' == __name__:
        main()

At this point, the CustomProxyRequest is logging the request method and URI to standard output. (To log the response, more work is required.) This is a good first step in helping me understand my browsing habits.
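Once the log exists, a few lines of Python can turn it into a rough picture of where the requests go. The sketch below assumes each log line has three tab-separated fields (timestamp, method, URI) and that the URI is absolute, as it normally is in a proxy request. The function name count_requests_by_host and the sample hosts are invented for illustration.

```python
import collections
from urllib.parse import urlsplit


def count_requests_by_host(log_lines):
    """Count logged requests per hostname, skipping lines that do not parse."""
    counts = collections.Counter()
    for line in log_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # not a timestamp / method / URI record
        _timestamp, _method, uri = fields
        try:
            host = urlsplit(uri).hostname if "://" in uri else None
        except ValueError:
            host = None  # malformed URI
        counts[host or "(unknown)"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "1.0\tGET\thttp://news.example.com/a\n",
        "2.0\tGET\thttp://news.example.com/b\n",
        "3.0\tPOST\thttp://work.example.org/\n",
    ]
    for host, n in count_requests_by_host(sample).most_common():
        print(host, n)
```

Counting requests is a crude proxy for time spent, since one page can trigger dozens of requests, but it is enough to show which hosts dominate the log.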

In the future, I hope to modify the proxy so it actively helps me focus, possibly by injecting pages that remind me that I should be working instead of browsing. I’m not sure how best to accomplish this, whether I need to study machine learning or if a set of rules will do the trick. Suggestions are welcome.
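A plain set of rules may be enough to start with. As a minimal sketch of that option, the function below decides whether a request deserves a reminder, based on a blocklist and the hour of the day. Everything in it is an assumption made for illustration: the names should_remind, BLOCKED_HOSTS, WORK_HOURS, and REMINDER_PAGE, the example hostname, and the nine-to-five working hours.

```python
from urllib.parse import urlsplit

# Hypothetical rules; a real list would come from a config file.
BLOCKED_HOSTS = {"news.example.com"}
WORK_HOURS = range(9, 17)  # 09:00 up to, but not including, 17:00

REMINDER_PAGE = "<html><body><h1>Back to work</h1></body></html>"


def should_remind(uri, hour, blocked_hosts=BLOCKED_HOSTS, work_hours=WORK_HOURS):
    """Return True if uri points at a blocked host during working hours."""
    if "://" not in uri:
        return False  # not an absolute URI, so there is no host to check
    host = urlsplit(uri).hostname or ""
    # Match the blocked host itself and any of its subdomains.
    in_blocklist = any(
        host == blocked or host.endswith("." + blocked)
        for blocked in blocked_hosts
    )
    return in_blocklist and hour in work_hours


if __name__ == "__main__":
    print(should_remind("http://news.example.com/today", hour=10))
    print(should_remind("http://news.example.com/today", hour=20))
```

One untested way to hook this in would be to call it at the top of CustomProxyRequest.process and, when it returns True, answer with REMINDER_PAGE via the request's write and finish methods instead of forwarding the request.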
