One neat upgrade in Debian's recent 5.0.0 release was Squid 2.7. In this bandwidth-starved corner of the world, a caching proxy is a nice addition to a network, as it should shave at least 10% off your monthly bandwidth usage. However, the recent rise of CDNs has made many objects that should be highly cacheable, un-cacheable.
For example, a YouTube video has a static ID. The same piece of video will always have the same ID; it'll never be replaced by anything else (except a "sorry, this is no longer available" notice). But it's served from one of many delivery servers: if I watch it once, it may come from one cache node (say, v1.cache.googlevideo.com), but the next time it may come from v15.cache.googlevideo.com. And that's not all: the signature parameter is unique (to protect against hot-linking), as are other non-static parameters. Basically, any proxy will probably refuse to cache it (because of all the parameters), and even if it did, it'd be a waste of space, because the signature would ensure that no one ever accessed that cached item again.
I came across a page on the Squid wiki that describes a solution to this. Squid 2.7 introduces the concept of a storeurl_rewrite_program, which gets a chance to rewrite any URL before storing / accessing an item in the cache. Thus we could rewrite our example file to a normalised URL under a fake internal hostname, something like http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=<ID>&itag=<ITAG>.
We've normalised the URL and kept the only two parameters that matter: the video id, and the itag, which specifies the video quality level.
The Squid wiki page I mentioned includes a sample Perl script to perform this rewrite. It doesn't include the itag, and my Perl isn't good enough to fix that without making a dog's breakfast of it, so I re-wrote it in Python. You can find it at the end of this post. Each line the rewrite program reads contains a concurrency ID, the URL to be rewritten, and some parameters; we output the concurrency ID and the URL to rewrite to.
The concurrency ID is a way to use a single script to process rewrites from different Squid threads in parallel. The documentation on this is almost non-existent, but if you specify a non-zero storeurl_rewrite_concurrency, each request and response will be prepended with a numeric ID. The Perl script concatenated this directly before the re-written URL, but I separate them with a space. Both seem to work. (Bad documentation sucks.)
All that's left is to tell Squid to use this, and to override the caching rules on these URLs:
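In squid.conf, that looks something like this (a sketch using the Squid 2.7 directives; the helper path and URL patterns here are mine, adjust to taste):

    # the rewrite helper from the end of this post
    storeurl_rewrite_program /etc/squid/storeurl.py
    storeurl_rewrite_children 1
    storeurl_rewrite_concurrency 10

    # only rewrite the video URLs
    acl store_rewrite_list urlpath_regex \/(get_video\?|videoplayback\?)
    storeurl_access allow store_rewrite_list
    storeurl_access deny all

    # cache them hard, despite what the server says
    refresh_pattern -i (get_video\?|videoplayback\?) 10080 90% 43200 override-expire ignore-reload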
Done. And it seems to be working relatively well. If only I'd set this up last year when I had pesky house-mates watching youtube all day ;-)
It should of course be noted that doing this instructs your Squid proxy to break rules. Both override-expire and ignore-reload violate guarantees that the HTTP standards provide the browser and web-server about their communication with each other. They are relatively benign changes, but illegal nonetheless.
And it goes without saying that rewriting the URLs of stored objects could cause some major breakage, by treating different objects (with different URLs) as the same. The provided regexes seem sane enough that this shouldn't happen, but YMMV.
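Finally, the promised script. This is a minimal sketch along the lines described above (the URL patterns are assumptions, tune them against what you actually see in your logs):

    #!/usr/bin/env python
    # storeurl_rewrite helper: normalise YouTube video URLs so that the
    # same video, fetched from any cache node, maps to one cache key.
    import re
    import sys

    VIDEO = re.compile(
        r'^http://[a-z0-9.-]+\.(?:googlevideo|youtube)\.com'
        r'/(?:videoplayback|get_video)\?(.*)')

    def rewrite(url):
        m = VIDEO.match(url)
        if not m:
            return url
        params = dict(p.split('=', 1)
                      for p in m.group(1).split('&') if '=' in p)
        video_id = params.get('id') or params.get('video_id')
        if not video_id:
            return url
        # Fake hostname, so the rewritten key can't collide with a real URL
        return ('http://video-srv.youtube.com.SQUIDINTERNAL/get_video'
                '?video_id=%s&itag=%s' % (video_id, params.get('itag', '')))

    while True:
        line = sys.stdin.readline()
        if not line:
            break
        # Input:  "<concurrency-ID> <URL> <other parameters...>"
        fields = line.split()
        channel, url = fields[0], fields[1]
        # Output: "<concurrency-ID> <rewritten URL>"
        sys.stdout.write('%s %s\n' % (channel, rewrite(url)))
        sys.stdout.flush()  # squid expects an unbuffered answer per line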
I’m constantly surprised when I come across long-time Linux users who don’t know about SysRq. The Linux Magic System Request Key Hacks are a magic set of commands that you can get the Linux kernel to follow no matter what’s going on (unless it has panicked or totally deadlocked).
Why is this useful? Well, there are many situations where you can't shut a system down properly, but you need to reboot (your lp0 has caught fire, possibly?). In any of those situations, grab a console keyboard, and type Alt+SysRq+s (sync), Alt+SysRq+u (unmount), wait for it to have synced, and finally Alt+SysRq+b (reboot NOW!). If you don't have a handy keyboard attached to said machine, or are on another continent, you can echo the command letters into /proc/sysrq-trigger as root.
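For example, over ssh, as root:

    # 1 means all SysRq functions are allowed
    cat /proc/sys/kernel/sysrq

    # The remote equivalent of Alt+SysRq+{s,u,b}:
    echo s > /proc/sysrq-trigger    # sync all filesystems
    echo u > /proc/sysrq-trigger    # remount filesystems read-only
    echo b > /proc/sysrq-trigger    # reboot NOW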
In my books, the useful SysRq commands are the ones the kernel documentation spells out: s (sync all filesystems), u (remount everything read-only), b (reboot immediately), o (power off), e (SIGTERM everything except init), i (SIGKILL everything except init), and k (kill every process on the current console).
In fact, read the rest of the SysRq documentation, print it out, and tape it above your bed. Next time you reach for the reset switch on a Linux box, stop yourself, type the S,U,B sequence, and watch your system come up as if nothing untoward had happened.
Update: I previously recommended U,S,B but after a bit of digging, I think S,U,B may be correct.
I’ve written about this before, but Ctrl-Alt workspace-switching key-presses nail me routinely.
Let’s go through some history:
We have Ctrl-Alt-Delete, the “three-fingered-salute”, meaning reboot, right? That combination was designed to NEVER be pressed by accident. And it never used to be.
The X guys needed a kill-X key-press, as things can sometimes get broken in X. So they chose Ctrl-Alt-Backspace, which is also a pretty sensible combination. It’s very similar to Ctrl-Alt-Delete, so we remember it, and backspace has milder connotations than delete, so we understand it to mean that it’ll only kill a part of the system.
X also has some other Ctrl-Alt- shortcuts. Some of these are also suitably obscure, i.e. NumPad+ and NumPad-. Others, like Ctrl-Alt-F1, mean change to virtual console 1. That one you might press by accident if you are an old WordPerfect user, but it should be safe enough otherwise. They were designed to look like big brothers of (and even work as) the standard VT-changing key-presses.
For changing workspaces, Alt-F1 style key-presses were used, mimicking the VT-changing key-presses. This is great for *nix users, but people coming from Windows expect Alt-F4 to close a program, not take them to workspace 4.
So GNOME came along, and decided that instead, they’d use Ctrl-Alt-Arrow key-presses to change workspace. That’s fine, but it’s a pretty common action, so I’m often holding down Ctrl-Alt without even thinking about it. If I start editing something and press delete/backspace before I’ve released Ctrl-Alt: boom! And I run screaming and write a blog post.
Now, I know that Ctrl-Alt-{Delete,Backspace} can be disabled (even if the latter is a little tricky to do), but I’d really like to change them. I like being able to kill X without using another machine and ssh; I just don’t like it happening by accident. And no, the solution isn’t for me to change my workspace-changing keys, because this problem must affect every GNOME user, not just me.
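For what it’s worth, turning the zap key off entirely is just a ServerFlags option in xorg.conf, though “off” isn’t really what I want; “remapped” would be better:

    Section "ServerFlags"
        Option "DontZap" "true"
    EndSection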
Dangerous key-presses should be really unlikely key-presses. Alt-SysRq- key-presses are good in this regard, they’ll always be unlikely. (Oh, and they are insanely useful.)
My post on split-routing on OpenWRT has been incredibly popular, and led to many people implementing split-routing, whether or not they had OpenWRT. While leaving the port as an exercise for the reader is fun, it led to me having to help lots of newbies through porting that setup to a Debian / Ubuntu environment. To save myself some time, here's how I do it on Debian:
Background, especially for non-South African readers: bandwidth in South Africa is ridiculously expensive, especially international bandwidth. The point of this exercise is that we can buy "local-only" DSL accounts which only connect to South African networks. E.g. I have an account that gives me 30GB of local traffic / month, for the same cost as a 2.5GB international traffic account. Normally you'd change the username and password on your router to switch accounts when you wanted to do something like a Debian apt upgrade, but that's irritating. There's no reason why you can't have a Linux-based router concurrently connected to both accounts via the same ADSL line.
Firstly, we have a DSL modem. Doesn't matter what it is, it just has to support bridged mode. If it won't work without a DSL account, you can use the Telkom guest account. My recommendation for a modem is to buy a Telkom-branded Billion modem (because Telkom sells everything with really big chunky, well-surge-protected power supplies).
For the sake of this example, we have the modem (IP 10.0.0.2/24) plugged into eth0 on our server, which is running Debian or Ubuntu, doesn't really matter much - personal preference. The modem has DHCP turned off, and we have our PCs on the same ethernet segment as the modem. Obviously this is all trivial to change.
You need these packages installed:
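Probably something like this (exact package names vary by release; pppd's rp-pppoe plugin ships in the ppp package):

    sudo apt-get install ppp iproute dnsmasq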
You need ppp interfaces for your providers. I created /etc/ppp/peers/intl-dsl:
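Something along these lines (a sketch: it assumes a bridged PPPoE modem on eth0 and pppd's rp-pppoe plugin; the account name is a placeholder):

    # /etc/ppp/peers/intl-dsl (sketch)
    plugin rp-pppoe.so eth0
    # placeholder account name, matched against pap-secrets
    user "intl-account@isp.example"
    # always bind this connection to ppp0
    unit 0
    # only the international connection gets the default route
    defaultroute
    noipdefault
    persist
    maxfail 0
    hide-password
    noauth
    mtu 1492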
and /etc/ppp/peers/local-dsl:
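The same again, minus the default route (again a sketch, with a placeholder account name):

    # /etc/ppp/peers/local-dsl (sketch)
    plugin rp-pppoe.so eth0
    # placeholder account name, matched against pap-secrets
    user "local-account@isp.example"
    # always bind this connection to ppp1
    unit 1
    # note: no defaultroute here; local destinations are routed explicitly
    noipdefault
    persist
    maxfail 0
    hide-password
    noauth
    mtu 1492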
unit 1 makes a connection always bind to ppp1. Everything else is pretty standard. Note that only the international connection forces a default route.
To /etc/ppp/pap-secrets I added my username and password combinations:
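In the standard pap-secrets format (placeholder credentials, matching the peers files above):

    # client                      server  secret
    "intl-account@isp.example"    *       "intl-password"
    "local-account@isp.example"   *       "local-password"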
You need custom iproute2 routing tables for each interface, for the source routing. This will ensure that incoming connections get responded to out of the correct interface: as your provider only lets you send packets from your assigned IP address, you can't send packets with the international address out of the local interface. We get around that with multiple routing tables. Add these lines to /etc/iproute2/rt_tables:
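The numbers are arbitrary unused table IDs; the names (my choice, matching the connection names) are what the ip-up script below refers to:

    100 intl-dsl
    101 local-dsl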
Now for some magic. I create /etc/ppp/ip-up.d/20routing to set up routes when a connection comes up:
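Here's a sketch of such a script. It leans on the PPP_IFACE and PPP_LOCAL variables that Debian's /etc/ppp/ip-up exports to its run-parts scripts, and on the table names defined above; the route-file format (one destination per line) is an assumption carried through the rest of this post:

    #!/bin/sh
    # /etc/ppp/ip-up.d/20routing (sketch)

    case "$PPP_IFACE" in
        ppp0) NAME=intl-dsl ;;
        ppp1) NAME=local-dsl ;;
        *)    exit 0 ;;
    esac

    # Load static routes for this connection: one destination per line
    if [ -r "/etc/network/routes-$NAME" ]; then
        while read dest; do
            [ -n "$dest" ] && ip route replace "$dest" dev "$PPP_IFACE"
        done < "/etc/network/routes-$NAME"
    fi

    # Source routing: traffic from this link's address must leave via
    # this link, so replies to incoming connections work as expected
    ip route replace default dev "$PPP_IFACE" table "$NAME"
    ip rule del from "$PPP_LOCAL" 2>/dev/null
    ip rule add from "$PPP_LOCAL" table "$NAME"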
That script loads routes from /etc/network/routes-intl-dsl and /etc/network/routes-local-dsl. It also sets up source routing so that incoming connections work as expected.
Now, we need those route files to exist and contain something useful. Create the script /etc/cron.daily/za-routes (and make it executable):
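A sketch; the real download URL isn't reproduced here, so ROUTE_URL below is a placeholder you'll need to point at the aggregated route list:

    #!/bin/sh
    # Fetch the aggregated ZA route list for the local-dsl connection.
    ROUTE_URL="http://example.net/za-routes.txt"   # placeholder URL
    TMP=$(mktemp)

    if wget -q -O "$TMP" "$ROUTE_URL"; then
        mv "$TMP" /etc/network/routes-local-dsl
    else
        rm -f "$TMP"    # keep yesterday's routes on failure
    fi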
It downloads the routes file from cocooncrash's site (he gets them from local-route-server.is.co.za, aggregates them, and publishes every 6 hours). Run it now to seed that file.
Now some international-only routes. I use IS local DSL, so SAIX DNS queries should go through the SAIX connection even though the servers are local to ZA. My /etc/network/routes-intl-dsl contains the SAIX DNS servers and proxies:
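One destination (address or CIDR block) per line. The addresses below are documentation placeholders; fill in the real SAIX resolvers and proxies:

    192.0.2.1
    192.0.2.2
    198.51.100.0/24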
Now we can tell /etc/network/interfaces about our connections, so that they get brought up automatically on bootup:
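Using ifupdown's ppp method, with the provider names matching the files in /etc/ppp/peers:

    auto intl-dsl
    iface intl-dsl inet ppp
        provider intl-dsl

    auto local-dsl
    iface local-dsl inet ppp
        provider local-dsl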
For DNS, I use dnsmasq, hardcoded to point to the IS & SAIX upstreams. My machine's /etc/resolv.conf just points to this dnsmasq. So, something like this in /etc/resolv.conf:
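i.e. nothing but the local dnsmasq:

    nameserver 127.0.0.1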
And in /etc/dnsmasq.conf:
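The key bits are no-resolv, so dnsmasq doesn't read /etc/resolv.conf (which now points back at dnsmasq itself), and hardcoded upstreams. Placeholder addresses again; substitute the real IS and SAIX resolvers:

    # don't read /etc/resolv.conf: it points back at us
    no-resolv
    # upstream resolvers (placeholders: use the real IS and SAIX servers)
    server=192.0.2.1
    server=192.0.2.2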
If you haven't already, you'll need to turn on ip_forward. Add the following to /etc/sysctl.conf and then run sudo sysctl -p:
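That's just the one line:

    net.ipv4.ip_forward=1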
Finally, you'll need masquerading set up in your firewall. Here is a trivial example firewall; put it in /etc/network/if-up.d/firewall and make it executable. You should probably change it to suit your needs or use something else, but this should work:
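A minimal sketch that does nothing but NAT traffic leaving either DSL connection:

    #!/bin/sh
    # /etc/network/if-up.d/firewall (sketch): masquerade out of both
    # ppp links. No filtering at all here; add your own rules.
    iptables -t nat -F POSTROUTING
    iptables -t nat -A POSTROUTING -o ppp+ -j MASQUERADE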
I had an interesting discussion with “bonnyrsa” in #ubuntu-za today. He’d re-arranged his partitions with gparted, and copied and pasted his / partition, so that he could move it to the end of the disk.
However, this meant that he now had two partitions with the same UUID. While you can imagine that this is the correct result of a copy & paste operation, it means that your universally unique ID is now totally non-unique. Not in your PC, and not even on its home drive.
Ubuntu mounts by UUID, so now how do we know which partition is being mounted? We made the obvious guesses; however, neither was correct.
Mounting /dev/sda4 (ro) produced “/dev/sda4 already mounted or /mnt busy”.
Aha, so we must be running from /dev/sda4.
/dev/sda2 mounted fine, but then wouldn’t unmount: “it seems /dev/sda2 is mounted multiple times”.
Aaaargh!
I got him to reboot, change /dev/sda2’s UUID, and reboot again (sucks). Then everything was better.
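For the record, on an ext2/3 filesystem, giving the copy a fresh UUID is a one-liner (run it against the partition that isn’t currently mounted as /):

    sudo tune2fs -U random /dev/sda2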
This shouldn’t have happened. Non-unique UUIDs are a really crap situation to be in; they bring out bugs in all sorts of unexpected places. I think parted should (by default) change the UUID of a copied partition (although if you are copying an entire disk, it shouldn’t).
I’ve filed a bug on Launchpad, let’s see if anyone bites.
PS: All UUIDs in this post have been changed to protect the identity of innocent Ubuntu systems (who aren’t expecting a sudden attack of non-uniqueness).