Ben Laurie - Crypto Everywhere
Recent events, and a post to the OpenID list got me thinking.
Apparently rfc2817 allows an http url to be used for https security.
Given that Apache seems to have that implemented [1] and that the
openid url is mostly used for server to server communication, would
this be a way out of the http/https problem?
I know that none of the browsers support it, but I suppose that if the
client does not support this protocol, the server can redirect to the
https url? This seems like it could be easier to implement than XRI.
Disclaimer: I don’t know much about rfc2817
Henry
[1] http://www.mail-archive.com/dev-tech-crypto@lists.mozilla.org/msg00251.html
The core issue is that HTTPS is used to establish end-to-end security, meaning, in particular, authentication and secrecy. If the MitM can disable the upgrade to HTTPS then he defeats this aim. The fact that the server declines to serve an HTTP page is irrelevant: it is the phisher that will be serving the HTTP page, and he will have no such compunction.
The traditional fix is to have the client require HTTPS, which the MitM is powerless to interfere with. Upgrades would work fine if the HTTPS protocol said “connect on port 80, ask for an upgrade, and if you don’t get it, fail”. However, as things stand, upgrades work at the behest of the server, and therefore don’t work.
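A minimal sketch of the client-side rule described above: ask for the upgrade, and treat anything other than a successful switch as a failure. The `Upgrade: TLS/1.0` token and the 101 status code are taken from RFC 2817; the function names and the header dictionary are hypothetical, and no real network I/O is done here.

```python
def upgrade_succeeded(status_code, headers):
    """Return True only if the response agrees to switch to TLS.

    `headers` is a hypothetical dict of response header names to values.
    """
    normalised = {name.lower(): value.lower() for name, value in headers.items()}
    tokens = [t.strip() for t in normalised.get("upgrade", "").split(",")]
    return status_code == 101 and "tls/1.0" in tokens


def strict_client(status_code, headers):
    """Fail closed: never continue in plaintext if the upgrade was refused."""
    if not upgrade_succeeded(status_code, headers):
        raise ConnectionError("no TLS upgrade: refusing to continue in plaintext")
    return "start TLS handshake"
```

The point of the sketch is that the decision to fail sits with the client. A man in the middle can strip the upgrade from the response, but all that achieves is a failed connection rather than a plaintext one.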
Of course, the client “requires” HTTPS because there was a link that had a scheme of “https”. But why was that link followed? Because there was an earlier page with a trusted link (we hope) that was followed. (Note that this argument applies to both users clicking links and OpenID servers following metadata).
If that page was served over HTTP, then we are screwed, obviously (bearing in mind DNS cache attacks and weak PRNGs).
This leads to the inescapable conclusion that we should serve everything over HTTPS (or other secure channels).
Why don’t we? Cost. It takes far more tin to serve HTTPS than HTTP. Even really serious modern processors can only handle a few thousand new SSL sessions per second. New plaintext sessions can be dealt with in their tens of thousands.
Perhaps we should focus on this problem: we need cheap end-to-end encryption. HTTPS solves this problem partially through session caching, but it can’t easily be shared across protocols, and sessions typically last on the order of five minutes, an insanely conservative figure.
What we need is something like HTTPS, shareable across protocols, with caches that last at least hours, maybe days. And, for sites we have a particular affinity with, an SSH-like pairing protocol (with less public key crypto - i.e. more session sharing).
Having rehearsed this discussion many times, I know the next objection will be DoS on the servers: a bad guy can require the server to spend its life doing PK operations by pretending he has never connected before. Fine, relegate PK operations to the slow queue. Regular users will not be inconvenienced: they already have a session key. Legitimate new users will have to wait a little longer for initial load. Oh well.
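The slow-queue idea can be sketched as a scheduler with two queues. This is a toy model under stated assumptions: the class, its method names and the "tick" notion are all hypothetical, and real public key operations are replaced by simple bookkeeping.

```python
from collections import deque


class HandshakeScheduler:
    """Toy model: cached sessions take the fast path, new ones queue for PK work."""

    def __init__(self, slow_per_tick=1):
        self.fast = deque()          # clients resuming a known session
        self.slow = deque()          # clients needing a public key operation
        self.slow_per_tick = slow_per_tick
        self.sessions = set()        # session identifiers the server remembers

    def submit(self, client_id, session_id=None):
        if session_id is not None and session_id in self.sessions:
            self.fast.append(client_id)
        else:
            self.slow.append(client_id)

    def tick(self):
        """Serve every fast-path client, then a bounded number of slow ones."""
        served = list(self.fast)
        self.fast.clear()
        for _ in range(min(self.slow_per_tick, len(self.slow))):
            served.append(self.slow.popleft())
        return served
```

However many fake "new" clients an attacker submits, each tick still serves every client with a cached session, and the cost to the server is capped at `slow_per_tick` public key operations per tick.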
Ben Laurie - Keyczar!
When I joined Google over two years ago I was asked to find a small project to get used to the way development is done there. The project I chose was one that some colleagues had been thinking about, a key management library. I soon realised that unless the library also handled the crypto it was punting on the hard problem, so I extended it to do crypto and to handle key rotation and algorithm changes transparently to the user of the library.
About nine months later I handed over my “starter project” to Steve Weis, who has worked on it ever since. For a long time we’ve talked about releasing an open source version, and I’m pleased to say that Steve and intern Arkajit Dey did just that, earlier this week: Keyczar.
Keyczar is an open source cryptographic toolkit designed to make it easier and safer for developers to use cryptography in their applications. Keyczar supports authentication and encryption with both symmetric and asymmetric keys. Some features of Keyczar include:
- A simple API
- Key rotation and versioning
- Safe default algorithms, modes, and key lengths
- Automated generation of initialization vectors and ciphertext signatures
When we say simple, by the way, the code for loading a keyset and encrypting some plaintext is just two lines. Likewise for decryption. And the user doesn’t need to know anything about algorithms or modes.
Great work, guys! I look forward to the “real” version (C++, of course!).
Ben Laurie - Call Me Nostradamus!
Looking for links for the previous article on OpenID, I came across this post, from May 2007.
Sun’s House of Cards?
Sun have a plan. In short, they’re going to have an OpenID provider which authenticates Sun employees only.
That is, so long as you trust your DNS. Or, in other words, if you aren’t using any untrusted networks. How often does that happen?
And in the comments we find
Well, obviously it all has to run over TLS to be useful. Which should address those issues, right?
Comment by Tim Bray - 8 May 2007 @ 22:43
“Obviously”. Yes, that’s obvious to you and me, but really you need to write down the rules.
Plus, of course, X.509 certs haven’t proved to be the most invulnerable things in the world.
Comment by Ben - 10 May 2007 @ 8:10
Now, if that isn’t prophetic, I don’t know what is.
Ben Laurie - NYT Doesn’t Quite Get It, Hilarity From OpenID
The New York Times’ Randy Stross has a piece about passwords and what a bad idea they are (sorry, behind a loginwall). So far, so good (and I’ll admit to bias here: I was interviewed for this piece, and whilst there’s no attribution, what I was saying is clearly reflected in the article), but Stross weirdly focuses on OpenID as the continuing cause of our password woes, because, he says, it is blocking the deployment of information cards, which will save us all.
Now, I am no fan of OpenID, but Stross is dead wrong here. OpenID says nothing about how you log in. It is not OpenID’s fault that the login is generally done with a password - that blame we must all accept collectively.
And whilst I firmly believe that the only way out of this mess is strong authentication, information cards are hardly the be-all and end-all of that game. They certainly have a way to go in usability before they’re going to be taking the world by storm. Don’t blame OpenID for that.
In the meantime, Scott Kveton, chair of the OpenID Foundation board, reacts:
The OpenID community has identified two key issues it needs to address in 2008 that Randy mentioned in his column; security and usability.
I just have to giggle. I mean, apart from those two minor issues, OpenID is pretty good, right? He forgot to mention privacy, though.
Ben Laurie - OpenID/Debian PRNG/DNS Cache Poisoning
Where “P” stands for “Predictable”.
Richard Clayton and I today released a security advisory showing how three independent vulnerabilities combine to make a rather scary mess, mitigated only by the fact that no-one protects anything very valuable with OpenID anyway. But just think how much worse it could have been (on which I shall write more soon)!
Ben Laurie - Is Your DNS Really Safe?
Ever since the recent DNS alert people have been testing their DNS servers with various cute things that measure how many source ports you use, and how “random” they are. Not forgetting the command line versions, of course
dig +short porttest.dns-oarc.net TXT
dig +short txidtest.dns-oarc.net TXT
which yield output along the lines of
"aaa.bbb.ccc.ddd is GREAT: 27 queries in 12.7 seconds from 27 ports with std dev 15253"
But just how GREAT is that, really? Well, we don’t know. Why? Because there isn’t actually a way to test for randomness. Your DNS resolver could be using some easily predicted random number generator like, say, a linear congruential one, as is common in the rand() library function, but DNS-OARC would still say it was GREAT. Believe them when they say it isn’t GREAT, though! Non-randomness we can test for.
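To see why a spread-out sequence says nothing about predictability, here is a small sketch of a linear congruential generator used to pick "ports". The constants are illustrative and are not taken from any real resolver or rand() implementation, and the generator exposes its whole state in each output, which makes prediction trivial; real attacks on truncated outputs take more work.

```python
import statistics

# Illustrative LCG constants (an assumption of this sketch, not a real resolver's).
A, C, M = 25173, 13849, 65536


def lcg_ports(seed, count):
    """Generate `count` pseudo-random values standing in for source ports."""
    state = seed
    ports = []
    for _ in range(count):
        state = (A * state + C) % M
        ports.append(state)
    return ports


def predict_next(observed_port):
    """An attacker who knows the constants needs only one observation."""
    return (A * observed_port + C) % M


ports = lcg_ports(seed=12345, count=27)
print(len(set(ports)), "distinct ports, std dev", round(statistics.pstdev(ports)))
print("attacker predicts the last port:", predict_next(ports[-2]) == ports[-1])
```

A test that only counts ports and measures their spread has no way to distinguish this output from that of a good generator, yet every value after the first is known to the attacker in advance.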
So, how do you tell? The only way to know for sure is to review the code (or the silicon, see below). If someone tells you “don’t worry, we did statistical checks and it’s random” then make sure you’re holding on to your wallet - he’ll be selling you a bridge next.
But, you may say, we already know all the major caching resolvers have been patched and use decent randomness, so why is this an issue?
It is an issue because of NAT. If your resolver lives behind NAT (which is probably way more common since this alert, as many people’s reaction [mine included] was to stop using their ISP’s nameservers and stand up their own to resolve directly for them) and the NAT is doing source port translation (quite likely), then you are relying on the NAT gateway to provide your randomness. But random ports are not the best strategy for NAT. They want to avoid re-using ports too soon, so they tend to use an LRU queue instead. Pretty clearly an LRU queue can be probed and manipulated into predictability.
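A toy model of the LRU problem, assuming a NAT that hands out the least recently used port and returns released ports to the back of the queue. The class and the port range are hypothetical and real gateways are more complicated, but the sketch shows why observing one cycle of allocations is enough to predict what comes next, even if the initial order was random.

```python
import random
from collections import deque


class LruPortAllocator:
    """Hypothetical NAT port allocator: least recently used port goes out first."""

    def __init__(self, ports):
        self.free = deque(ports)

    def allocate(self):
        return self.free.popleft()

    def release(self, port):
        self.free.append(port)


pool = list(range(20000, 20008))
random.Random(1).shuffle(pool)          # even a shuffled starting order does not help
nat = LruPortAllocator(pool)

# The attacker probes through the NAT and records one full cycle of ports.
observed = []
for _ in range(len(pool)):
    port = nat.allocate()
    observed.append(port)
    nat.release(port)

predicted = observed[0]                 # the queue has wrapped round to its start
actual = nat.allocate()                 # the victim resolver's next source port
print("predicted", predicted, "actual", actual)
```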
So, if your NAT vendor is telling you not to worry, because the statistics say they are “random”, then I would start worrying a lot: your NAT vendor doesn’t understand the problem. It’s also pretty unhelpful for the various testers out there not to mention this issue, I must say.
Incidentally, I’m curious how much this has impacted the DNS infrastructure in terms of traffic - anyone out there got some statistics?
Oh, and I should say that number of ports and standard deviation are not a GREAT way to test for “randomness”. For example, the sequence 1000, 2000, …, 27000 has 27 ports and a standard deviation of over 7500, which looks pretty GREAT to me. But not very “random”.
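The arithmetic is easy to check with Python's standard `statistics` module. Whether the tester uses the population or the sample standard deviation is not stated, so the sketch computes both.

```python
import statistics

# The perfectly regular sequence from the text: 1000, 2000, ..., 27000.
ports = list(range(1000, 28000, 1000))

print(len(ports), "ports")
print("population std dev:", round(statistics.pstdev(ports)))
print("sample std dev:", round(statistics.stdev(ports)))

# Every step is exactly 1000, so the "next port" is trivially predictable.
steps = {b - a for a, b in zip(ports, ports[1:])}
print("distinct step sizes:", steps)
```

Both figures come out above 7500, so the sequence passes the port-count and spread checks while being about as predictable as a sequence can be.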
Ben Laurie - Why Not W3C or IETF?
Ralf Bendrath asks what’s wrong with the W3C and the IETF that the OWF is trying to solve? So, to be very brief…
The W3C is a pay-to-play cartel that increasingly gets nothing done. Open source developers can’t even participate, as a rule. It also has an IPR policy that’s just as crap as everything else we’re trying not to emulate. So, not a realistic alternative.
The IETF is much better, but its main problem is that it has no IPR policy at all, other than “tell us what you know”. In practice this often works out OK, but there have been some notable instances where the outcome was pretty amazingly ungood, such as RSA’s stranglehold over SSL and TLS for years - a position Certicom are now trying to emulate with ECC, also via the IETF.
A more minor objection to the IETF that I hope the OWF will solve similarly to the ASF is that it is actually too inclusive. Anyone is allowed to join a working group and have as much say as anyone else. This means that any fool with time on their hands can completely derail the process for as long as they feel like. In my view, a functional specification working group should give more weight to those that are actually going to implement the specification and those who have a track record of actually being useful, much as the ASF pays more attention to contributors, committers and members, in that order.
Ben Laurie - Open Web Foundation
I’m very pleased that we’ve launched the Open Web Foundation today. As Scott Kveton says
The OWF is an organization modeled after the Apache Software Foundation; we wanted to use a model that has been working and has stood the test of time.
When we started the ASF, we wanted to create the best possible place for open source developers to come and share their work. As time went by, it became apparent that the code wasn’t the only problem - standards were, too. The ASF board (and members, I’m sure) debated the subject several times whilst I was serving on it, and no doubt still does, but we always decided that we should focus on a problem we knew we could solve.
So, I’m extra-happy that finally a group of community-minded volunteers have come together to try to do the same thing for standards.
Ben Laurie - Getting At Public Data
The government has quietly launched two quite fascinating initiatives. I have no idea why there wasn’t more fanfare. I was even at OpenTech, where one was announced, and I didn’t know!
Firstly, Show Us A Better Way
Ever been frustrated that you can’t find out something that ought to be easy to find? Ever been baffled by league tables or ‘performance indicators’? Do you think that better use of public information could improve health, education, justice or society at large?
The UK Government wants to hear your ideas for new products that could improve the way public information is communicated.
And 20 grand for the best ideas, too.
Secondly, The Public Sector Unlocking Service (Beta). I love that they put “Beta” in there. Tell them about crown copyright data some bureaucrat is hoarding, and they’ll read them the riot act. Awesome.
Ben Laurie - The Register on Security
So, The Register has a story on Mozilla doing security metrics. Which is cool.
But what tickles me is that The Register thinks I should download an Excel file to read more about the project. Yeah, right.
Ben Laurie - Caja Security Review
A few weeks ago, we invited a group of external security experts to come and spend a week trying to break Caja. As we expected, they did. Quite often. In fact, I believe a team member calculated that they filed a new issue every 5 minutes throughout the week.
The good news, though, was that nothing they found was too hard to fix. Also, their criticism has led to some rethinking about some aspects of our approach which we hope will make the next security review easier as well as Caja more robust.
You can read a summary of their findings.
Ben Laurie - Analysing Data Loss
My colleague, Steve Weis, has an interesting article analysing the Dataloss Database. With pictures!
Within accidental disclosures, 36% were due to improper disposal of media or computers. Surprisingly, 30% were due to leaks via snail mail.
Ben Laurie - ACTA, The Pirate Bay and BTNS
Doc Searls just pointed me at a couple of articles. The first is about ACTA.
ACTA, first unveiled after being leaked to the public via Wikileaks, has sometimes been lauded by its supporters as “The Pirate Bay-killer,” due to its measures to criminalize the facilitation of copyright infringement on the internet – text arguably written specifically to beat pirate BitTorrent trackers. The accord will add internet copyright enforcement to international law and force national ISPs to respond to international information requests, and subjects iPods and other electronic devices to ex parte searches at international borders.
Obviously this is yet another thing we must resist. The Pirate Bay’s answer to this
IPETEE would first test whether the remote machine is supporting the crypto technology; once that’s confirmed it would then exchange encryption keys with the machine before transmitting your actual request and sending the video file your way. All data would automatically be unscrambled once it reaches your machine, so there would be no need for your media player or download manager to support any new encryption technologies. And if the remote machine didn’t know how to handle encryption, the whole transfer would fall back to an unencrypted connection.
is a great idea, but … it’s already been done by the IETF BTNS (Better-Than-Nothing Security) Working Group.
The WG has the following specific goals:
a) Develop an informational framework document to describe the motivation and goals for having security protocols that support anonymous keying of security associations in general, and IPsec and IKE in particular
Hmmm. I guess I should figure out how I switch this on. Anyone?
Ben Laurie - FreeBMD Gets New Boots
FreeBMD recently moved its servers from one of The Bunker’s data centres to the other.
Our marvellous sysadmin posted some pictures. It never fails to amaze me how much tin it takes to keep that crazy idea running.
Ben Laurie - ORG Report on E-counting
It seems like a long time since I spent a very long afternoon (and evening) observing the electronic count of the London Elections. Yesterday, the Open Rights Group released its report on the count. The verdict?
there is insufficient evidence available to allow independent observers to state reliably whether the results declared in the May 2008 elections for the Mayor of London and the London Assembly are an accurate representation of voters’ intentions.
There was lots of nice machinery and pretty screens to watch, but in my view three more things were needed to ensure confidence in the vote.
- A display that showed (a random selection of) ballots and the corresponding vote recorded automatically.
- No machines connected to the network that could not be observed.
- A commitment to the vote (I mean this in the cryptographic sense), followed by a manual recount of randomly selected ballot boxes.
The last point is technically tricky to do properly, but I think it could be achieved. For example, take the hash of each ballot box’s count, then form a Merkle tree from those. Publish the root of the tree as the commitment, then after the manual recount, show that the hashes of the (electronic) counts for those boxes (which you would have to reveal anyway to verify the recount) are consistent with the tree.
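A minimal sketch of that commitment scheme, using SHA-256 from Python's standard library. Two details are assumptions of the sketch rather than part of the proposal: each leaf includes a random nonce (a bare count is guessable, so hashing it alone would hide little), and an odd-sized level is padded by duplicating its last node. Box names and counts are made up.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def leaf_hash(box_id, count, nonce):
    """Hash of one ballot box's electronic count (the nonce is an added assumption)."""
    return h(b"leaf:" + ("%s:%d:" % (box_id, count)).encode() + nonce)


def build_levels(leaves):
    """Return all tree levels, from the leaves up to the single root."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        level = levels[-1]
        if len(level) % 2:
            level = level + [level[-1]]          # pad odd levels
        levels.append([h(b"node:" + level[i] + level[i + 1])
                       for i in range(0, len(level), 2)])
    return levels


def proof_for(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(root, leaf, path):
    node = leaf
    for sibling, sibling_is_left in path:
        node = h(b"node:" + (sibling + node if sibling_is_left else node + sibling))
    return node == root


# Hypothetical data: five boxes with fixed nonces so the example is repeatable.
boxes = [("box-%d" % i, 400 + i, bytes([i]) * 16) for i in range(5)]
levels = build_levels([leaf_hash(*box) for box in boxes])
root = levels[-1][0]                             # published as the commitment
```

After the manual recount of, say, `box-3`, the count and nonce for that box are revealed along with `proof_for(levels, 3)`, and anyone can call `verify` against the published root. In a real deployment the nonces would come from a secure random source.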
Ben Laurie - ICANN Create Domain Cash Cow
Back when I used to serve on Nominet’s Policy Advisory Board, I used to find myself regularly arguing against the creation of new subdomains under .uk. Why? Because the only point I can actually see for creating a new subdomain is so that the registrars can make a huge pile of money while everyone scrambles to register in the new domain in order to protect their brand names.
Does anyone else benefit in any way? No. The registrants do not benefit: they already had domain names, they didn’t need any more. The public do not benefit: one domain name is quite sufficient for any Internet service.
So, given the complete pointlessness of doing this, I am not in the slightest surprised to hear that that most pointless of organisations, ICANN, has decided to allow approximately a zillion new TLDs.
In their usual egotistical style, they bill this piece of stupidity as…
Biggest Expansion to Internet in Forty Years Approved for Implementation
The only thing this expands is the wallets of registrars and, presumably, ICANN’s coffers. The Internet itself is not expanded one iota by this dumb move.
I guess the interesting thing to watch here is who manages to figure out the best TLDs to persuade people they need to register to protect themselves. “.trademark” sounds promising to me. “.name” would also be good. I invite your suggestions - perhaps we should form a consortium to register them, too.
Think I can get .ben? That would be cool.
Ben Laurie - Information Card Foundation Launched
Yet another industry alliance launches today: the Information Card Foundation (yes, I know that’s currently a holding page: as always, the Americans think June 24th starts when they wake up).
I have agreed to be a “Community Steering Member”, which means I sit on the board and get a vote on what the ICF does. Weirdly, I am also representing Google on the ICF board. I guess I brought that on myself.
I am not super-happy with the ICF’s IPR policy, though it is slightly better than the OpenID Foundation’s. I had hoped to get that fixed before launch, but there’s only so many legal reviews the various founders could put up with at short notice, so I will have to continue to tinker post-launch.
It is also far from clear how sincere Microsoft are about all this. Will they behave, or will they be up to their usual shenanigans? We shall see (though the adoption of a fantastically weak IPR policy is not the best of starts)! And on that note, I still wait for any sign of movement at all on the technology Microsoft acquired from Credentica - which they have kinda, sorta, maybe committed to making generally available. This is key, IMO, to the next generation of identity management systems and will only flourish if people can freely experiment with it. So what are they waiting for?
(More news reports than you can shake a stick at.)
Ben Laurie - Can Haz Blogroll
I’ve been meaning to put one of these up for years. Literally.
Anyway, I finally got around to it.
Ben Laurie - Using OpenID Responsibly
Some guy called Thomas asks the very reasonable question (where “this problem” is the OpenID phishing problem):
Too much of all of this discussion around OpenID focuses around whether or not it’s OpenID’s job to solve this problem, whether it is insecure, whether it promotes phishing, and so on. But none of the discussion focuses on what you should actually *do* when you care about making it easy for people to use your site while keeping security good enough.
Someone smart on the topic care to tell me what I should be doing as a website maker, and as a potential OpenID user on other websites ?
So, the answer to this is: you should only accept OpenID logins from providers that use unphishable authentication. How can you know what authentication they use? Well, right now you can’t, but a group of us are about to work on the OpenID Provider Authentication Policy Extension (a.k.a. PAPE) which will enable you to find out.
Until then, my answer continues to be “just say no”, if you are a website maker. If you are an OpenID user, then the answer is to find a provider that supports unphishable authentication - at least you will be safe, even if the rest of the world continues to suffer.
Ben Laurie - FF3: Better Late Than Never
Apparently there’s a launch party for Firefox 3 in London, open to all. Tonight.
Ben Laurie - Preprint: The Neb
Actually, this paper has a longer and sillier title, “Choose the Red Pill and the Blue Pill”. It was born at NDSS in a conversation with Abe Singer, my co-author.
The basic idea is that we have a choice between an operating system we can trust and one that is usable. A trustable system would be very bland and grey (the red pill) and a usable system would be full of fun and colour - but security would be a fantasy (the blue pill). In the paper, we discuss how to have your cake and eat it.
“The Neb” is the secure operating system we propose (short, of course, for the Nebuchadnezzar), btw.
Ben Laurie - Preprint: (Under)mining Privacy in Social Networks
Actually, I’m not sure if this one ends up in print or not. But anyway, I think its content is obvious from the title.
My colleagues Monica Chew and Dirk Balfanz did all the hard work on this paper.
Ben Laurie - Preprint: Access Control
I have three currently unpublished papers that may be of interest. This one has been submitted but not yet accepted. As you can guess from the title, it’s about access control, particularly in the area of mashups, gadgets and web applications.
This is the introduction:
Access control is central to computer security. Traditionally, we wish to restrict the user to exactly what he should be able to do, no more and no less.
You might think that this only applies to legitimate users: where do attackers fit into this worldview? Of course, an attacker is a user whose access should be limited just like any other. Increasingly, of course, computers expose services that are available to anyone - in other words, anyone can be a legitimate user.
As well as users there are also programs we would like to control. For example, the program that keeps the clock correctly set on my machine should be allowed to set the clock and talk to other time-keeping programs on the Internet, and probably nothing else\footnote{Perhaps it should also be allowed a little long-term storage, for example to keep its calculation of the drift of the native clock.}.
Increasingly we are moving towards an environment where users choose what is installed on their machines, where their trust in what is installed is highly variable\footnote{A user probably trusts their operating system more than their browser, their browser more than the pages they browse to and some pages more than others.} and where “installation” of software is an increasingly fluid concept, particularly in the context of the Web, where merely viewing a page can cause code to run.
In this paper I explore an alternative to the traditional mechanisms of roles and access control lists. Although I focus on the use case of web pages, mashups and gadgets, the technology is applicable to all access control.
And the paper is here.
Regular readers will not be surprised to hear I am talking about capabilities.
Ben Laurie - Modern Mail Clients
Way back when, I used to use Pine to read my email. After it had marked everything I read as unread again once too many times (admittedly not entirely its fault, but it did leave everything ’til the last minute), I switched to, well, something else. I don’t remember exactly what. But after a long series of experiments I ended up with Thunderbird, which I mostly like - or at least hate less than all other clients I’ve tried.
But, it really doesn’t handle big mailboxes very well. I’m lazy when it comes to tidying up, as my wife will testify, and so I tend to find myself with 100,000 read messages lying around and a similar number unread.
Thunderbird can read mailboxes like that (which is an improvement - earlier versions couldn’t), but it really doesn’t handle deleting them very well. Select even a small number, like a thousand or so, and hit delete, and watch Thunderbird go away for a very long sleep.
In the end I had to go back to Pine to tidy my mailbox. Incidentally, I tried mutt, but it couldn’t handle more than a few thousand messages at a time. Pine seems to manage whatever I throw at it, though its UI can only be described as arcane.
So, my question to the lazyweb: is there an answer to this? A modern open source client that can do graphical stuff, is nice to use and can handle big IMAP mailboxes? Or is my Thunderbird/Pine hybrid as good as it gets?
Ben Laurie - Picnic Cous Cous
I invented this to eat at Glyndebourne, doncha know.
Olive oil
Onion, chopped
Cumin, dry fried and ground
Coriander, dry fried and ground
Ginger, finely chopped
Chicken breast (might also be nice with thigh), roughly chopped
Dried apricots, soaked
Raisins
Chicken stock
Pistachios, shelled and roasted
Pine nuts, roasted
Coriander leaf
Parsley
Cous cous
Although I describe the nuts as roasted I actually did them in a dry frying pan over a low heat, stirring constantly. Likewise the spices (of which you need a lot).
Gently fry the onions in olive oil until soft. Add the ground spices and ginger, increase the heat a little, stir and fry for a couple of minutes. Add the chopped chicken breast and cook, stirring occasionally, until mostly done. Add the liquid from the soaked apricots, roughly chop the apricots and add them, plus the raisins and a little concentrated chicken stock (I use a liquid concentrate). If there isn’t enough water, then add some more, but you don’t need to even cover the chicken. Salt and pepper to taste. Bring to the boil, cover and simmer, stirring occasionally and breaking up the larger chicken pieces with your spoon. After about 30 minutes, turn off and leave to cool (overnight if you wish).
Then prepare the cous cous according to its instructions. Once it is ready, mix in the cooled (and thickened, there should be no free liquid by now) chicken stuff, the nuts, lots of chopped coriander and a little chopped parsley. Then wrap it up and take it to your picnic. We had a green salad and a potato salad with it.
I was planning to offer lime wedges for squeezing over it, but I forgot.
Also nice microwaved if you have leftovers.
Ben Laurie - Exploiting Network Cards
A friend of mine, Arrigo Triulzi (no web page that he wants to admit to), has just posted this fantastically scary missive to the Robust Open Source mailing list (no public archive, so I will quote it in its entirety)
I’ve been working on firmware for the past two and a bit years, in particular in the field of firmware viruses.
Without needlessly boring everyone with the various steps allow me to share an interesting observation: drivers often assume the hardware is misbehaved but never malicious. It is fascinating to discover what can be done by making the hardware malicious.
Summarising briefly my work, as yet unpublished except the obligatory notices to the affected vendors (in what follows please read NIC as strictly wired, no wireless cards):
1) there are remarkably naive “protection” methods to prevent malicious users from overwriting NIC firmware with something of their choice,
2) as an extension to 1) above it is amazing to discover how simply firmware can be updated over the wire on specific NICs,
3) from 1 & 2 above, after about two years, I’ve reached my goal of writing a totally transparent firewall bypass engine for those firewalls which are PC-based: you simply overwrite the firmware in both NICs and then perform PCI-to-PCI transfers between the two cards for suitably formatted IP packets (modern NICs have IP “offload engines” in hardware and therefore can trigger on incoming and outgoing packets). The resulting “Jedi Packet Trick” (sorry, couldn’t resist) fools, amongst others, CheckPoint FW-1, Linux-based Strongwall, etc. This is of course obvious as none of them check PCI-to-PCI transfers,
4) I have extended the technique to provide VM escape support: one writes packets from a bridged guest into the network which initiates the NIC firmware update, updates the firmware and then the NIC firmware is used to inject code into the underlying VM host. The requirement to write to the network is then dropped as all that is required is the pivoting in the NIC firmware.
This scares the crap out of me, just as it stands. But he’s missed a trick, IMO: because of the nature of the PCI bus, you can use the same technique on any machine with a vulnerable NIC to read all of RAM. You might even be able to read disk, too, depending on the disk controller.
Oh boy, this is going to be a can of worms once exploits start appearing (if they haven’t already, that is).
Ben Laurie - Debian and OpenSSL: The Last Word?
I am reliably informed that, despite my previous claim, at least one member of the OpenSSL team does read openssl-dev religiously. For which he should be commended. I read it sometimes, too, but not religiously.
So, forget I said that you don’t reach the OpenSSL developers by posting on openssl-dev.
Ben Laurie - Debian and OpenSSL: The Aftermath
There have been an astonishing number of comments on my post about the Debian OpenSSL debacle; clearly this is a subject people have strong feelings about. But there are some points raised that need addressing, so here we go.
Firstly, many, many people seem to think that I am opposed to removing the use of uninitialised memory. I am not. As has been pointed out, this leads to undefined behaviour - and whilst that’s probably not a real issue given the current state of compiler technology, I can certainly believe in a future where compilers are clever enough to work out that on some calls the memory is not initialised and take action that might be unfortunate. I would also note in passing that my copy of K&R (second edition) does not discuss this issue, and ISO/IEC 9899, which some have quoted in support, rather post-dates the code in OpenSSL. To be clear, I am now in favour of addressing this issue correctly.
And this leads me to the second point. Many people seem to be confused about what change was actually made. There were, in fact, two changes. The first concerned a function called ssleay_rand_add(). As a developer using OpenSSL you would never call this function directly, but it is usually (unless a custom PRNG has been substituted, as happens in FIPS mode, for example) called indirectly via RAND_add(). This call is the only way entropy can be added to the PRNG’s pool. OpenSSL calls RAND_add() on buffers that may not have been initialised in a couple of places, and this is the cause of the valgrind warnings. However, rather than fix the calls to RAND_add(), the Debian maintainer instead removed the code that added the buffer handed to ssleay_rand_add() to the pool. This meant that the pool ended up with essentially no entropy. Clearly this was a very bad idea.
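The effect of that first change can be illustrated with a toy pool. This is emphatically not OpenSSL's code: the class, its hash-based mixing and the `mix_input` switch are all inventions of the sketch. The `pid` argument reflects the widely reported fact that the process ID was still mixed in when output was extracted; treat that detail as an assumption here.

```python
import hashlib


class ToyPool:
    """Toy entropy pool; `mix_input=False` models an add() that ignores its buffer."""

    def __init__(self, mix_input=True):
        self.state = b""
        self.mix_input = mix_input

    def add(self, buf):
        # The only route by which entropy can enter the pool.
        if self.mix_input:
            self.state = hashlib.sha256(self.state + buf).digest()

    def extract(self, n, pid):
        """Return up to 32 bytes of output and stir the pool."""
        out = hashlib.sha256(self.state + pid.to_bytes(4, "big")).digest()
        self.state = hashlib.sha256(out + self.state).digest()
        return out[:n]


def first_output(seed, pid, mix_input):
    pool = ToyPool(mix_input=mix_input)
    pool.add(seed)
    return pool.extract(16, pid)


# With mixing intact, different seeds give different output.
print(first_output(b"seed A", 1234, True) != first_output(b"seed B", 1234, True))
# With mixing removed, the seed is irrelevant: only the pid matters.
print(first_output(b"seed A", 1234, False) == first_output(b"seed B", 1234, False))
```

In the toy, once `add()` stops mixing, the number of possible first outputs is bounded by the number of possible process IDs, which is small enough to enumerate.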
The second change was in ssleay_rand_bytes(), a function that extracts randomness from the pool into a buffer. Again, applications would access this via RAND_bytes() rather than directly. In this function, the contents of the buffer before it is filled are added to the pool. Once more, this could be uninitialised. The Debian developer also removed this call, and that is fine.
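To make the difference between the two changes concrete, here is a minimal sketch in Python. It is a toy, not OpenSSL's actual algorithm: the class ToyPool, its method names and the use of SHA-1 here are all assumptions made for illustration. It also assumes, as was widely reported at the time, that the process ID was the main varying input left once nothing could be added to the pool.

```python
import hashlib
import os


class ToyPool:
    """Toy entropy pool for illustration only; NOT OpenSSL's real PRNG."""

    def __init__(self, mix_enabled=True):
        self.state = b"\x00" * 20
        # mix_enabled=False models the first change: add() does nothing.
        self.mix_enabled = mix_enabled

    def add(self, buf):
        # Stands in for ssleay_rand_add(): the only route into the pool.
        if self.mix_enabled:
            self.state = hashlib.sha1(self.state + buf).digest()

    def get_bytes(self, n, pid):
        # Stands in for ssleay_rand_bytes(). Assumption: the process ID
        # is mixed in when output is produced.
        out = b""
        counter = 0
        while len(out) < n:
            block = self.state + pid.to_bytes(4, "big") + counter.to_bytes(4, "big")
            out += hashlib.sha1(block).digest()
            counter += 1
        return out[:n]


def distinct_outputs(mix_enabled, trials=200, max_pid=8):
    """Count distinct 16-byte outputs over many freshly seeded pools."""
    seen = set()
    for i in range(trials):
        pool = ToyPool(mix_enabled)
        pool.add(os.urandom(32))  # good entropy, if the pool accepts it
        seen.add(pool.get_bytes(16, pid=i % max_pid))
    return len(seen)


print(distinct_outputs(mix_enabled=True))
print(distinct_outputs(mix_enabled=False))
```

With mixing enabled, every run should give a different output. With it disabled, the number of distinct outputs can never exceed the number of process IDs tried, however good the entropy offered to add() was. Dropping only the extra mixing of the caller's buffer on extraction (the second change) would leave add() intact, which is why that removal is harmless.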
The third point: several people have come to the conclusion that OpenSSL relies on uninitialised memory for entropy. This is not so. OpenSSL gets its entropy from a variety of platform-dependent sources. Uninitialised memory is merely a bonus source of potential entropy, and is not counted as “real” entropy.
Fourthly, I said in my original post that if the Debian maintainer had asked the developers, then we would have advised against such a change. About 50% of the comments on my post point to this conversation on the openssl-dev mailing list. In this thread, the Debian maintainer states his intention to remove for debugging purposes a couple of lines that are “adding an unintialiased buffer to the pool”. In fact, the first line he quotes is the first one I described above, i.e. the only route to adding anything to the pool. Two OpenSSL developers responded, the first saying “use -DPURIFY” and the second saying “if it helps with debugging, I’m in favor of removing them”. Had they been inspired to check carefully what these lines of code actually were, rather than believing the description, then they would, indeed, have noticed the problem and said something, I am sure. But their response can hardly be taken as unconditional endorsement of the change.
Fifthly, I said that openssl-dev was not the way to ensure you had the attention of the OpenSSL team. Many have pointed out that the website says it is the place to discuss the development of OpenSSL, and this is true, it is what it says. But it is wrong. The reality is that the list is used to discuss application development questions and is not reliably read by the development team.
Sixthly, my objection to the fix Debian put in place has been misunderstood. The issue is not that they did not fully reverse their previous patch - as I say above, the second removal is actually fine. My issue is that it was committed to a public repository five days before an advisory was issued. Only a single attacker has to notice that and realise its import in order to start exploiting vulnerable systems - and I will be surprised if that has not happened.
I think that’s about enough clarification. The question is: what should we do to avoid this happening again? Firstly, if package maintainers think they are fixing a bug, then they should try to get it fixed upstream, not fix it locally. Had that been done in this case, there is no doubt none of this would have happened. Secondly, it seems clear that we (the OpenSSL team) need to find a way that people can reliably communicate with us in these kinds of cases.
The problem with the second is that there are a lot of people who think we should assist them, and OpenSSL is spectacularly underfunded compared to most other open source projects of its importance. No-one that I am aware of is paid by their employer to work full-time on it. Despite the widespread use of OpenSSL, almost no-one funds development on it. And, indeed, many commercial companies who absolutely depend on it refuse to even acknowledge publicly that they use it, despite the requirements of the licence, let alone contribute towards it in any way.
I welcome any suggestions to improve this situation.
Incidentally, some of the comments are not exactly what I would consider appropriate, and there’s a lot of repetition. I moderate comments on my blog, but only to remove spam (and the occasional cockup, such as people posting twice, not realising they are being moderated). I do not censor the comments, so don’t blame me for their content!
Ben Laurie - Vendors Are Bad For Security
I’ve ranted about this at length before, I’m sure - even in print, in O’Reilly’s Open Sources 2. But now Debian have proved me right (again) beyond my wildest expectations. Two years ago, they “fixed” a “problem” in OpenSSL reported by valgrind[1] by removing any possibility of adding any entropy to OpenSSL’s pool of randomness[2].
The result of this is that for the last two years (from Debian’s “Etch” release until now), anyone doing pretty much any crypto on Debian (and hence Ubuntu) has been using easily guessable keys. This includes SSH keys, SSL keys and OpenVPN keys.
What can we learn from this? Firstly, vendors should not be fixing problems (or, really, anything) in open source packages by patching them locally - they should contribute their patches upstream to the package maintainers. Had Debian done this in this case, we (the OpenSSL Team) would have fallen about laughing, and once we had got our breath back, told them what a terrible idea this was. But no, it seems that every vendor wants to “add value” by getting in between the user of the software and its author.
Secondly, if you are going to fix bugs, then you should install this maxim of mine firmly in your head: never fix a bug you don’t understand. I’m not sure I’ve ever put that in writing before, but anyone who’s worked with me will have heard me say it multiple times.
Incidentally, while I am talking about vendors who are bad for security, it saddens me to have to report that FreeBSD, my favourite open source operating system, are also guilty. Not only do they have local patches in their ports system that should clearly be sent upstream, but they also install packages without running the self-tests. This has bitten me twice by installing broken crypto, most recently in the py-openssl package.
[1] Valgrind is a wonderful tool, I recommend it highly.
[2] Valgrind tracks the use of uninitialised memory. Usually it is bad to have any kind of dependency on uninitialised memory, but OpenSSL happens to include a rare case when it’s OK, or even a good idea: its randomness pool. Adding uninitialised memory to it can do no harm and might do some good, which is why we do it. It does cause irritating errors from some kinds of debugging tools, though, including valgrind and Purify. For that reason, we do have a flag (PURIFY) that removes the offending code. However, the Debian maintainers, instead of tracking down the source of the uninitialised memory, chose to remove any possibility of adding memory to the pool at all. Clearly they had not understood the bug before fixing it.
P.S. I’d link to the offending patch in Debian’s source repository. If I could find a source repository. But I can’t.
(Update)
Thanks to Cat Okita, I have now found the repo. Here’s the offending patch. But I have to admit to being astonished again by the fix, which was committed five days before the advisory! Do these guys have no clue whatsoever?
Ben Laurie - The World Without “Identity” or “Federation” is Already Here
My friend Alec Muffett thinks we should do away with “Big I” Identity. I’m all for that … but Alec seems to be quite confused.
Firstly, his central point, that all modern electronic identity requires the involvement of third parties, is just plain wrong. OpenID, which he doesn’t mention, is all about self-asserted identity - I put stuff on webpages I own and that’s my identity. Cardspace, to the extent it is used at all, is mostly used with self-signed certificates - I issue a new one for each site I want to log in to, and each time I visit that site I prove again that I own the corresponding private key. And, indeed, this is a pretty general theme through the “user-centric” identity community.
Secondly, the idea that you can get away with no third party involvement is just unrealistic. If everyone were honest, then sure, why go beyond self-assertion? But everyone is not. How do we deal with bad actors? Alec starts off down that path himself, with his motorcycling example: obviously conducting a driving test on the spot does not scale well - when I took my test, it took around 40 minutes to cover all the aspects considered necessary to establish sufficient skill, and I’d hesitate to argue that it could be reduced. The test used to be much shorter, and the price we paid was a very high death rate amongst young motorcyclists; stronger rules have made big inroads on that statistic. It is not realistic to expect either me or the police to spend 40 minutes establishing my competence every time it comes into question. Alec appears to be recognising this problem by suggesting that the officer might instead rely on the word of my local bike club. But this has two problems: firstly, I am now relying on a third party (the club) to certify me, which is exactly counter to Alec’s stated desires, and secondly, how does one deal with clubs whose only purpose is to certify people who actually should not be allowed to drive (because they’re incompetent or dangerous, for example)?
The usual answer one will get at this point from those who have not worked their way through the issues yet is “aha, but we don’t need a central authority to fix this problem, instead we can rely on some kind of reputation system”. The trouble is no-one has figured out how you build a reputation system in cyberspace (and perhaps in meatspace, too) that is not easily subverted by people creating networks of “fake” identities purely in order to boost their own reputations - at least, not without some kind of central authority attesting to identity.
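The subversion described above is easy to show with a toy. The following Python sketch assumes the simplest imaginable reputation system, one that scores an identity by how many distinct identities endorse it; the function naive_reputation and all the names in the data are made up for illustration and do not describe any real system.

```python
def naive_reputation(endorsements):
    """Score each identity by the number of distinct identities endorsing it."""
    endorsers = {}
    for endorser, target in endorsements:
        if endorser != target:  # ignore self-endorsement
            endorsers.setdefault(target, set()).add(endorser)
    return {target: len(who) for target, who in endorsers.items()}


# Three real people vouch for carol.
honest = [("alice", "carol"), ("bob", "carol"), ("dave", "carol")]

# mallory costs nothing to outvote them: she just mints fake identities.
fakes = [(f"sock{i}", "mallory") for i in range(50)]

scores = naive_reputation(honest + fakes)
print(scores["carol"], scores["mallory"])
```

Nothing in the data distinguishes the fifty fake endorsers from the three real ones, which is the point: without something outside the system attesting that an identity corresponds to a distinct person, the count is meaningless.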
Yet another issue that has to be faced is what to do about negative attributes (e.g. “this guy is a bad risk, don’t lend him money because he never pays it back”). No-one is going to willingly make those available to others. Once more, we end up having to invoke some kind of authority.
Of course, there are many cases where self-assertion is perfectly fine, so I have no argument with Alec there. And yes, there is a school of thought that says any involvement with self-issued stuff is a ridiculous idea, but you mostly run into that amongst policy people, who like to think that we’re all too stupid to look after ourselves, and corporate types who love silos (we find a lot of those in the Liberty Alliance and the ITU and such-like places, in my experience).
But the bottom line is that a) what he wants is insufficient to completely deal with the problems of identity and reputation and b) it is nothing that plenty of us haven’t been saying (and doing) all along - at least where it works.
Once you’ve figured that out, you realise how wrong
I am also here not going to get into the weirdness of Identity wherein the goal is to centralise your personal information to make management of it convenient, and then expend phenomenal amounts of brainpower implementing limited-disclosure mechanisms and other mathematica, in order to re-constrain the amount of information that is shared; e.g. “prove you are old enough to buy booze without disclosing how old you are”. Why consolidate the information in the first place, if it’s gonna be more work to keep it secret henceforth? It’s enough to drive you round the twist, but it’ll have to wait for a separate rant.
is. Consolidation is not what makes it necessary to use selective disclosure - that is driven by the need for the involvement of third parties. Obviously I can consolidate self-asserted attributes without any need for selective disclosure - if I want to prove something new or less revealing, I just create a new attribute. Whether it’s stored “centrally” (what alternative does Alec envision, I wonder?) or not is entirely orthogonal to the question.
Incidentally, the wit that said “Something you had, Something you forgot, Something you were” was the marvellous Nick Mathewson, one of the guys behind the Tor project. Also, Alec, if you think identity theft is fraud (as I do), then I recommend not using the misleading term preferred by those who want to shift blame, and call it “identity fraud” - in fraud, the victim is the person who believes the impersonator, not the person impersonated. Of course the banks would very much like you to believe that identity fraud is your problem, but it is not: it is theirs.
Ben Laurie - Petition Against Unfair Motorcycle Tax
Not much I can add to the petition’s own words! Sign up here.
Changes to the law mean cars emitting less than 100g of CO2 per kilometre travelled would be exempt from paying Vehicle Excise Duty (road tax), while motorcycles are still required to pay.
This was outlined by your Chancellor Alistair Darling in his first budget last week, under the auspices of rewarding motorists for driving ‘green’ vehicles.
Despite Darling’s aim, the rate of road tax paid by motorcyclists is set to double in 2009, with the annual charge for a typical 125cc commuter bike set to grow from £15 per year at present, to £33 in 2009.
This makes a nonsense of the revised rates of vehicle excise duty, as motorcycles tend to emit less CO2 and use less fuel than cars, with the average CO2 output from motorcycles at 110g/km.
So why do those who ride greener two wheeled vehicles, use less road space and do not contribute to congestion get penalised whilst 4 wheel motorist whose vehicles use under 100g/km are exempt from road tax
Ben Laurie - Fun With FreeBSD and gmirror
A while ago I moved a lot of my stuff from a very ancient box to a quite new one. For some reason the new one has three disks in it, and so we (that is the ultra-patient Lemon and me) decided to mirror two of them. Not really having need of a third enormous disk it was left spare for now (possibly this was unwise in retrospect).
Since I run FreeBSD on my server boxes, we used gmirror. Being adventurous, we also decided we were going to mirror the root partition - slightly nerve-wracking, because when FreeBSD boots, it boots from the (unmirrored) root partition. But the theory is this works fine with mirrored disks.
So, we had three disks, which FreeBSD saw as ad4 (ata2-master), ad5 (ata2-slave) and ad6 (ata3-master). We figured that ad4 and ad6 should be the mirrors, since they are on different controllers. So that’s what we did and it all works fine.
Fast forward several months and it’s time to upgrade the kernel. We’re moving from FreeBSD 6.x to FreeBSD 7.x, so it’s slightly nerve-wracking, but I do what I always do, which is to build the system from source, following the time-honoured
1. `cd /usr/src' (or to the directory containing your source tree).
2. `make buildworld'
3. `make buildkernel KERNCONF=YOUR_KERNEL_HERE' (default is GENERIC).
4. `make installkernel KERNCONF=YOUR_KERNEL_HERE' (default is GENERIC).
[steps 3. & 4. can be combined by using the "kernel" target]
5. `reboot' (in single user mode: boot -s from the loader prompt).
6. `mergemaster -p'
7. `make installworld'
8. `make delete-old'
9. `mergemaster'
10. `reboot'
(btw, mergemaster -U is good medicine for step 9). Everything goes fine. Until I realise uname -a is still reporting we’re running FreeBSD 6.2! WTF?
Well, to make a long story short, for some reason the BIOS thinks ata2-slave (i.e. our “spare” disk!) is the “first” disk, and so this is what it boots off. Presumably during the build the system got installed on “disk 1”, whichever that happened to be (we didn’t actually do the base build ourselves). Then when mirroring was set up, our “disk 1” didn’t match the BIOS’s, and confusion reigned.
The happy ending to this story, though, is that
- You can run FreeBSD 7 userland on a FreeBSD 6.2 kernel!
- Switching the BIOS to boot off “disk 2” (i.e. ata2-master) made everything work as it should
I am recording this episode not because I think it is very interesting but because I hope it’ll be useful to someone else.
Ben Laurie - Do We Need Credentica?
I read that IBM have finally contributed Idemix to Higgins.
But … I am puzzled. Everyone knows that the reason Idemix has not been contributed sooner is because it infringes the Credentica patents. At least, so says Stefan - I wouldn’t know, I haven’t checked. But it seems plausible that at least IBM think that’s true.
So, what’s changed? Have IBM decided that Idemix does not infringe? Or did Microsoft let them publish? Or what?
If it’s the former, then do others agree? And if it’s the latter, then in what sense is this open source? If IBM have some kind of special permission with regard to the patents, that is of no assistance to the rest of us.
It seems to me that someone needs to do some explaining. But if the outcome is that Idemix really is open source, then what is the relevance of Credentica?
Incidentally, I wanted to take a look at what it is that IBM have actually released, but there doesn’t seem to be anything there.
Ben Laurie - Can Phorm Intercept SSL?
Someone asked me to comment on a thread over at BadPhorm on SSL interception.
In short, the question is: can someone in Phorm’s position decrypt SSL somehow? The fear is driven by the existence of appliances that do just this. But these appliances need to do one of two special things to work.
The first possibility is where the appliance is deployed in a corporate network to monitor traffic going from browsers inside the corporation to SSL servers outside. In this case, you have the SSL appliance act as a CA, and you install its CA certificate in each browser’s store of trusted CAs. Then, when the appliance sees an SSL request go past, it quickly creates (some would say “forges”) a certificate for the server the request is destined for and, instead of routing the connection on to the real server, answers it itself, using the newly created certificate. Because the browser trusts the appliance’s CA this all looks perfectly fine and it will proceed without a warning. The appliance then creates an outgoing connection to the real server and acts as a proxy between the browser and server, thus getting access to the plaintext of the interaction.
I’d note in passing that in Netronome’s diagram they show a “trust relationship” between the webserver and the SSL appliance. This is not correct. There need be no relationship at all between the webserver and the appliance - indeed it would be fair to say that many a webserver operator would view what the appliance is doing as downright sneaky. Or dishonest, even.
But, in any case, inside the corporation this behaviour seems fair enough to me - they’re paying for the browser, the machine it runs on, the network connection and the employee’s time. I guess they have a right to see the data.
Could Phorm do this? Well, they could try to persuade anyone stupid enough to install a CA certificate of theirs in their browser, and then yes, indeed, this trick would work for them. Moral of the story: don’t install such certificates. Note that last time I looked, if you wanted to register to do online returns for VAT you had to install one of these things. Oops!
Or, they could get certified as a CA and get automatically installed in everyone’s browser. I’m pretty sure, however, that such a use of a CA key would find them in breach of the conditions attached to their certification.
So, in short, Phorm can only do this to people who don’t understand what’s going on - i.e. 99% of Internet users. But not me.
The second scenario is to deploy the SSL interception appliance at the webserver end of the network (at least, this is how it’s usually done), and have it sniff incoming connections to the webserver. However, to break these connections it needs to have a copy of the webserver’s private key. I’m reasonably confident that the vast majority of webserver operators will not be handing over their private keys to Phorm, so even “99%” users are safe from this attack.
By the way, if you want to see this one in action, then you can: the excellent network sniffer, Wireshark, can do it. Full instructions can be found here. No need to buy an expensive appliance.
Ben Laurie - Yet Another Version Control System (and an Apache Module)
I recently finished off mod_digest for Canonical. To you: the guys that make Ubuntu.
In the process I was forced to use yet another distributed version control system, Bazaar. Once I’d figured out that the FreeBSD port was devel/bazaar-ng and not devel/bazaar, I quite liked it. All these systems are turning out to be pretty much the same, so it’s the bells and whistles that matter. In the case of Bazaar the bell (or whistle) I liked was this
$ bzr push
Using saved location: sftp://ben-links@bazaar.launchpad.net/~ben-links/apache-mod-digest/bensbranch/
Yes! In Monotone, I’m permanently confused about branches and repos and, well, stuff. Mercurial makes me edit a config file to set a default push location. Bazaar remembers what I did last time. How obvious is that?
Ben Laurie - Why You Should Always Use End-to-End Encryption
A Twitter user has had all her private messages exposed to the world. This is one of the reasons I try to avoid sending private messages (at least, ones that I would like to remain private) over any system that does not employ end-to-end encryption.
At least then my only exposure is to my correspondent, not the muppets that run the messaging service I used.
One service this poor unfortunate has done for the world, though, is to provide an excellent example of why you should use cryptography routinely: you need not have any more to hide than your embarrassment.
Incidentally, I am going to stop using the combined tag “Anonymity/Privacy” after this post - clearly they are not always both applicable.
Ben Laurie - Phorm Legal Analysis
FIPR’s Nick Bohm has written a fascinating legal analysis of Phorm’s proposed system. It’s nice that RIPA’s effects are not all bad, but it turns out that, in Nick’s opinion, Phorm are on the hook for a number of other illegal acts under various acts…
- The Regulation of Investigatory Powers Act 2000
- The Fraud Act 2006
- The Data Protection Act 1998
He also beats up Simon Watkin of the Home Office (well-known in UK privacy circles for spending a great deal of energy trying to persuade us all that RIPA [then known as RIP] was going to be alright, really), for a note he wrote which suggested that Phorm’s business model was just fine under RIPA. Simon stays true to form by pointing out that the note wasn’t actually advice, and was not based on paying any attention at all to what Phorm were actually proposing. One has to wonder, then, what the point of writing it was?
Perhaps more disturbingly, Nick also talks about what may be the first attempt at enforcement against Phorm. Not surprisingly, the police say they’re too busy and it’s the Home Office’s problem and the Home Office say it’s not their job to investigate offences under RIPA. Isn’t it lucky, then, that we are doing their investigating for them?
I’m also pleased to see that Nick supports my view that the consent of both the user and the web server must be obtained for Phorm’s interception to be legal under RIPA
RIPA s3(1) makes it lawful if the interception has the consent of both
sender and recipient (or if the interceptor has reasonable grounds for believing
that it does). This raises the question of whose consent is required for the
interception of communications of those using web browsers.
I’m also intrigued by Nick’s analysis of Phorm’s obligation under the Data Protection Act. Where sensitive personal data is processed by Phorm, then the user’s consent must be obtained. Nick argues that Phorm will see information relating to
• their racial or ethnic origin,
• their political opinions,
• their religious or similar beliefs,
• whether they are members of a trade union,
• their physical or mental health or condition,
• their sexual life,
• the commission or alleged commission by them of any offence, or
• any proceedings for any offence committed or alleged to have been
committed by them, the disposal of such proceedings or the sentence of
any court in such proceedings
It occurs to me that Nick has missed a trick here: the user might also view sensitive data relating to a third party - for example, they might participate in a closed web forum where, say, sexual preferences are discussed. In this case, it seems to me, the consent of that third party would need to be obtained by Phorm.
Ben Laurie - Oyster is Toast
The MiFare stream cipher, as used in Oyster cards, has been comprehensively cracked. The researchers claim they can recover the key in well under 5 minutes after observing a single transaction.
When will people learn that making up your own ciphers is a fantastically bad idea?
Ben Laurie - Nice Review of Caja
Tim Oren posted about Caja.
…this adds up to a very good chance that something that’s right now fairly obscure could turn into a major force in Web 2.0 within months, not years. Because Caja modifies the de facto definition of JavaScript, it would have an immediate impact on any scripts and sites that are doing things regarded as unsafe in the new model. If you’ve got a Web 2.0 based site, get ready for a project to review for ‘Caja-safety’. If the Caja model spreads, then the edges of the sandbox are going to get blurry. Various users and sites will be able to make choices to allow more powerful operations, and figuring out which ones are significant and allow enhanced value could be a fairly torturous product management challenge, and perhaps allow market entry chances for more powerful forms of widgets and Facebook-style ‘apps’.
End of message.
Ben Laurie - Conflicting Roles
Pamela Dingle writes about the problems of people having conflicting roles. Funnily enough I’m working on a paper about roles, too, but more on that later. Right now I wanted to observe that the problem she describes
There is no simple way to say that John is a broker 100% of the time, but 50% of the time he represents Client A and only Client A, and the other 50% he solely represents Client B. There is no way to represent mutual exclusivity of roles in a single user profile (that I’m aware of).
can be handled in an interesting way in SE-Linux: there you can make the rule that once the user (or rather, a program acting on behalf of the user) has accessed any resource corresponding to Client A he is no longer allowed to access resources corresponding to Client B, and vice versa. Of course, leaping from this to the idea that you’ve built a real Chinese Wall between the two clients is falling foul of one of the fallacies of DRM: of course the user can find ways to transport data across that wall. But, nevertheless, SE-Linux is a system in which it is possible to express such policies.
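The rule itself is simple enough to sketch. The following Python toy is emphatically not SE-Linux policy syntax, and the class ChineseWall and its method names are made up for illustration; it only models the decision described above: the first access to one client in a conflicting set locks the user out of the others.

```python
class ChineseWall:
    """Toy model of the rule; not how SE-Linux expresses or enforces it."""

    def __init__(self, conflict_set):
        # Clients whose resources must never be accessed by the same user.
        self.conflict_set = frozenset(conflict_set)
        # Remembers which conflicting client each user touched first.
        self.chosen = {}

    def access(self, user, client):
        """Return True if the access is allowed, recording first choices."""
        if client not in self.conflict_set:
            return True  # resources outside the conflict set are unrestricted
        first = self.chosen.setdefault(user, client)
        return first == client


wall = ChineseWall({"Client A", "Client B"})
print(wall.access("john", "Client A"))  # first access: allowed, and remembered
print(wall.access("john", "Client B"))  # conflicts with the earlier access
```

Note that the state is per user, so a second broker could still choose Client B first. And, as said above, a check like this only governs access to resources; it does nothing to stop john carrying what he learned across the wall by other means.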
Ben Laurie - More Bullshit from Phorm
Phorm continue to sob that us whining privacy advocates are misrepresenting their system
Phorm’s chairman and chief executive, Kent Ertugrul, said yesterday the firm was the victim of misinformation. “What is so strange about this is that if you were to put on a board what we do and what has been written about us and map the two, you would find there is very little correlation,” he said.
I’d be more than happy to compare what I’ve said to what their system actually does, only … when the Open Rights Group nominated me to be briefed by Phorm (in my capacity as both a director of ORG and a subject matter expert) they declined, on the basis that I work for a competitor, despite my assurance that I would not be acting for Google in any way, as is always the case when I do stuff for ORG. But, hey, trust is a one-way street, apparently, if you are Phorm - as one of the surveilled, I must trust them, but that’s no reason they should trust me, is it?
Strangely they were quite happy to brief two of my colleagues in detail, without any NDA - and my colleagues are planning to produce a full, public report of that briefing. With a bit of luck, they’ll have addressed all my concerns, but who knows? I wasn’t there to assist in that process.
Interestingly, they go on to say
“What we would like to do is issue a challenge to the privacy community to select some of their most technically savvy representatives and form an inspection committee. We would be delighted, on a recurring basis, to give those people the ability to spot inspect what it is we do.”
which rather emphasizes one of the core problems with their system: it requires everyone to trust that all this data they have gathered without consent is actually handled as they claim it is handled.
I do hope Phorm will be paying the going rate for this valuable service - but probably I won’t find out because I expect that, despite my obvious qualifications, I will be excluded from such a group. It wouldn’t do to have anyone too expert looking at their system, after all.
Ben Laurie - Microsoft Implement The Evil Bit
Thanks to the Shindig mailing list, I’ve just noticed this gem from Microsoft.
The essence here is that third party sites inside frames might invade your privacy by setting cookies, so IE6, by default, doesn’t let them set cookies. But, if they promise to be good, then it will allow them to be bad. Isn’t that marvellous?
What I think is particularly excellent about Microsoft’s support article is that they tell you how to suppress the behaviour by setting an appropriate P3P policy … but they don’t tell you what this policy really means, nor suggest that you should only set the policy if you actually conform to it.
Of course, you can tell it’s a Microsoft protocol because it takes 21 bytes to do what the original proposal could do in a single bit.
Ben Laurie - Federated Messaging Meets Federated Identity
XMPP, OAuth and OpenID. Social networking in real-time. Interesting. Peter Saint-Andre thinks we should talk about it.
Sign up here.
Ben Laurie - Fun With Dot
As I’ve mentioned before, people don’t really talk much about the experience of writing and debugging code, so here’s another instalment in my occasional series on doing just that.
Over the Easter weekend the weather has been pretty horrible, so, instead of having fun on my motorbike, I’ve been amusing myself in various ways: trying to finish up a paper I started last year, doing further work on OpenSSL and Deputy, cooking, playing with Tahoe (on which more later), updating my FreeBSD machines, and messing around with Graphviz.
Graphviz is one of my favourite toys. Basically, it lets you specify how a bunch of things are connected, and then it will draw them for you. A project I’d long had in my head was to take all the RFCs, work out which ones reference which other ones and have Graphviz draw it for me. Getting the data turned out to be pretty easy, but unfortunately the resulting dataset proves to be too much for poor old Graphviz, exposing all sorts of bugs in its drawing engine, which led to core dumps and/or complete garbage in the output files. Shame, early experiments promised some quite pretty output. Anyway, after banging my head against that for many hours, I gave up and instead did something I do every few months: got my various FreeBSD machines up-to-date. As part of that process, I had to look at the stuff that FreeBSD runs at startup, configured in /etc/rc.conf (and /etc/defaults/rc.conf), and actually done by scripts in /etc/rc.d and /usr/local/etc/rc.d.
This reminded me that these scripts expose their dependencies in comments, like this (from /etc/rc.d/pfsync)
# PROVIDE: pfsync
# REQUIRE: root FILESYSTEMS netif
So, I thought it would be fun to graph those dependencies - then at least I’d have one pretty thing to show for the weekend. Then, since it only took 15 minutes to do, I thought it might make an interesting subject for a post on how I go about coding such things.
So, first things first, I like some instant gratification, so step one is to eat the rc files and see if I can parse them. I wasn’t quite sure if /etc/rc.d had subdirectories, but since I had some code already to read all files in a directory and all its subdirectories (from my failed attempt at the RFCs) I just grabbed that and edited it slightly:
sub findRCs {
    my $dir = shift;
    local(*D);
    opendir(D, $dir) || croak "Can't open $dir: $!";
    while(my $f = readdir D) {
        next if $f eq '.' || $f eq '..';
        my $file="$dir/$f";
        if(-d $file) {
            findRCs($file);
            next;
        }
        readRC($file);
    }
}
This will call readRC for each file in the provided directory. My first version of readRC looked like this:
sub readRC {
    my $file = shift;

    my $rc = read_file($file);
    my($provide) = $rc =~ /^\# PROVIDE: (\S+)$/m;
    croak "Can't find PROVIDE in $file" if !$provide;
    print "$file: $provide\n";
}
Note that I assume that each file PROVIDEs only one thing, since I match \S+ (i.e. 1 or more non-whitespace characters), and force the matched string to span a whole line. This starts off well
/etc/rc.d/accounting: accounting
/etc/rc.d/amd: amd
/etc/rc.d/addswap: addswap
.
.
.
but ends
Can't find PROVIDE in /etc/rc.d/NETWORKING at ./graph-rc.pl line 13
main::readRC('/etc/rc.d/NETWORKING') called at ./graph-rc.pl line 30
main::findRCs('/etc/rc.d') called at ./graph-rc.pl line 35
oops. If we look at the offending file, we see
# PROVIDE: NETWORKING NETWORK
# REQUIRE: netif netoptions routing network_ipv6 isdnd ppp
# REQUIRE: routed mrouted route6d mroute6d resolv
OK, so it provides two things, it seems. Fair enough, I can fix that, I just have to elaborate the matching slightly
my($provide) = $rc =~ /^\# PROVIDE: (.+)$/m;
croak "Can't find PROVIDE in $file" if !$provide;
my @provide = split /\s/, $provide;
print "$file: ", join(', ', @provide), "\n";
In other words, match everything after PROVIDE: and then split it on whitespace. Notice that this file also has multiple REQUIRE lines - lucky I noticed that, it could easily have escaped my attention. Anyway, after this modification, I can read the whole of /etc/rc.d. Now I need to match the requirements, which I do like this
my(@lrequire) = $rc =~ /^# REQUIRE: (.+)$/mg;
my @require = split /\s/, join(' ', @lrequire);
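To see the two matches working together without touching the filesystem, here is a minimal standalone sketch run against the NETWORKING header quoted above. The helper name parse_rc is made up for illustration and is not part of the script, and it uses split ' ' rather than split /\s/ so that runs of whitespace do not produce empty entries.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): pull the PROVIDE and
# REQUIRE names out of the text of a single rc file.
sub parse_rc {
    my $rc = shift;

    # Only the first PROVIDE line is used, as in the script above.
    my($provide) = $rc =~ /^# PROVIDE: (.+)$/m;
    return unless defined $provide;
    my @provide = split ' ', $provide;

    # /g in list context returns the capture from every REQUIRE line.
    my @lrequire = $rc =~ /^# REQUIRE: (.+)$/mg;
    my @require = split ' ', join(' ', @lrequire);

    return (\@provide, \@require);
}

my $sample = <<'EOT';
#!/bin/sh
#
# PROVIDE: NETWORKING NETWORK
# REQUIRE: netif netoptions routing network_ipv6 isdnd ppp
# REQUIRE: routed mrouted route6d mroute6d resolv
EOT

my($p, $r) = parse_rc($sample);
print "provides: @$p\n";
print "requires: @$r\n";
```

Both REQUIRE lines end up in one flat list, which is all the graphing step needs.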
Another test, just printing what I extracted (print ' ', join (', ',@require), "\n";) and this seems to work fine. So far I’ve only been testing with /etc/rc.d, but now I’m almost ready to start graphing, I also test /usr/local/etc/rc.d…
Can't find PROVIDE in /usr/local/etc/rc.d/sshd_localhost.sh at ./graph-rc.pl line 13
main::readRC('/usr/local/etc/rc.d/sshd_localhost.sh') called at ./graph-rc.pl line 35
main::findRCs('/usr/local/etc/rc.d') called at ./graph-rc.pl line 40
OK, so this is a very old rc file of my own and it has no require/provides stuff. In fact, it totally departs from the spec. Whatever … I decide to just skip files that don’t include PROVIDE
if($rc !~ /PROVIDE/) { print STDERR "Skipping $file\n"; return; }
A quick test confirms that it only skips that one file, and now everything works. OK, so time to graph! All I need to do is generate a file in a format Graphviz can read, which is amazingly easy. First I have to output a header
print "digraph rfcs {\n";
print " node [fontname=\"Courier\"];\n";
then a line for each dependency
foreach my $p (@provide) {
    foreach my $r (@require) {
        print " $r -> $p; \n";
    }
}
and finally a trailer
print "}\n";
This produces a file that looks like this
digraph rfcs {
 node [fontname="Courier"];
 mountcritremote -> accounting;
 rpcbind -> amd;
 ypbind -> amd;
 nfsclient -> amd;
 cleanvar -> amd;
.
.
.
}
which I can just feed to dot (one of the Graphviz programs), like so
dot -v -Tpng -o ~/tmp/rc.png /tmp/xx
and I get a lovely shiny graph. But while I’m admiring it, I notice that ramdisk has a link to itself, which seems a bit rum. On closer inspection, /etc/rc.d/ramdisk says
# PROVIDE: ramdisk
# REQUIRE: localswap
which doesn’t include a self-reference. Odd. Looking at the output from my script I notice
ramdisk -> ramdisk-own;
Guessing wildly that dot doesn’t like the “-”, I modify the output slightly
foreach my $p (@provide) {
    foreach my $r (@require) {
        print " \"$r\" -> \"$p\"; \n";
    }
}
and bingo, it works. Putting it all together, here’s the final script in full
#!/usr/bin/perl -w

use strict;
use File::Slurp;
use Carp;

sub readRC {
    my $file = shift;

    my $rc = read_file($file);
    if($rc !~ /PROVIDE/) {
        print STDERR "Skipping $file\n";
        return;
    }
    my($provide) = $rc =~ /^\# PROVIDE: (.+)$/m;
    croak "Can't find PROVIDE in $file" if !$provide;
    my @provide = split /\s/, $provide;
    my(@lrequire) = $rc =~ /^# REQUIRE: (.+)$/mg;
    my @require = split /\s/, join(' ', @lrequire);

    foreach my $p (@provide) {
        foreach my $r (@require) {
            print " \"$r\" -> \"$p\"; \n";
        }
    }
}

sub findRCs {
    my $dir = shift;

    local(*D);
    opendir(D, $dir) || croak "Can't open $dir: $!";
    while(my $f = readdir D) {
        next if $f eq '.' || $f eq '..';
        my $file="$dir/$f";
        if(-d $file) {
            findRCs($file);
            next;
        }
        readRC($file);
    }
}

print "digraph rfcs {\n";
print " node [fontname=\"Courier\"];\n";

while(my $dir=shift) {
    findRCs($dir);
}

print "}\n";
and running it
./graph-rc.pl /etc/rc.d /usr/local/etc/rc.d > /tmp/xx
dot -v -Tpng -o ~/tmp/rc.png /tmp/xx
And finally, here’s the graph. Interesting that randomness is at the root!
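A footnote on the quoting fix: wrapping every name in double quotes is enough for rc names, but a name that itself contained a double quote would break the output. A slightly more defensive sketch is below. The helpers dot_id and dot_edge are hypothetical names, not part of the script above, and I am assuming that an embedded double quote is the only character that needs escaping inside a quoted DOT identifier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn an arbitrary name into a quoted DOT identifier.
sub dot_id {
    my $name = shift;
    $name =~ s/"/\\"/g;    # escape embedded double quotes
    return qq{"$name"};
}

# Hypothetical helper: one edge line in the same layout the script prints.
sub dot_edge {
    my($from, $to) = @_;
    return ' ' . dot_id($from) . ' -> ' . dot_id($to) . ";\n";
}

print dot_edge('ramdisk', 'ramdisk-own');
```

In the script, the print inside the nested loops would then become print dot_edge($r, $p).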
Ben Laurie - Am I Reassured?
Mike Jones (of Microsoft) tells me I was wrong to be worried about Microsoft’s Open Specification Promise. I wasn’t actually that worried, but now I am.
He says
The “analysis” tries to insinuate that since Microsoft doesn’t promise that future revisions of specifications covered by the Open Specification Promise will be automatically covered unless Microsoft is involved in developing them, that it’s not safe to rely on the OSP for current versions either. This is of course false, as the OSP is an irrevocable promise that Microsoft will never sue anyone for using any of the covered specifications (unless they sue Microsoft for using the same specification, which is a normal exception in all such non-assertion covenants).
Clearly the point is not that current specifications might stop being covered. The problem is that if future versions of the specs are not covered then it will become irrelevant that current ones are - there’s no point in implementing a standard that no-one uses anymore. And given Microsoft’s track record in the area of extending specifications until no-one can implement them, this seems like a very real risk.
He then points to a response in the context of OOXML that is supposed to further reassure. But it does the opposite
This section points out that the OSP only applies to listed versions of covered specifications. True, except that we have already committed to extending it to ISO/IEC DIS 29500 when it is approved in our filing with ISO/IEC. For ODF, IBM in their ISP takes the identical approach. Strange how things that seem appropriate for ODF are not appropriate for Open XML.
In other words, Microsoft can do exactly what I am concerned about, except that in the case of OOXML (which I really don’t care about) they’ve promised not to. Nice for word processors, perhaps. Not so nice for security people, who are covered by no such promise.
OSP covers specifications not code
Not true. The OSP is a promise to not assert patents that are necessarily infringed by implementations of covered specifications. Like all similar patent non-asserts (including the Sun and IBM versions for ODF) the promise covers that part of a product that implements that specification (and not other parts that have nothing to do with the specification). While the Sun covenant is silent about conformance to the specification, the OSP allows implementers the freedom to implement any (or all) parts of a covered specification and to the extent they do implement those portions (also known as conform to those parts) they are covered by the promise for those parts. Contrast that to the IBM pledge that requires total conformance and so programming errors or absence of something required by the spec (but not by an implementer’s product) means that the promise is totally void for that product.
I just don’t get this. It starts with “not true” but then goes on to confirm that it is true! That is, the code is only usable insofar as it is used to implement a covered specification. Programming errors would void their promise just as they do IBM’s. And similarly, re-use of the code for a different purpose would also not be covered. In other words, the specification is covered, not the code.
If Microsoft can’t coherently defend themselves against this analysis but still want us to believe it is incorrect, that seems cause for concern, don’t you think?
Ben Laurie - Interoperability
Despite Kim’s promise in his blog
That doesn’t mean it is trivial to figure out the best legal mecahnisms for making the intellectual property and even the code available to the ecosystem. Lawyers are needed, and it takes a while. But I can guarantee everyone that I have zero intention of hoarding Minimal Disclosure Tokens or turning U-Prove into a proprietary Microsoft technology silo.
Like, it’s 2008, right? Give me a break, guys!
I’ve now heard through several different channels that Microsoft want to “ensure interoperability”. Well. Interoperability with what, I ask? In order for things to be interoperable, they must adhere to a standard. And for Microsoft to ensure interoperability, they have to both licence the intellectual property such that it can only be used in conformance to that standard and they have to control the standard.
I don’t know about you, but that sure sounds like a “proprietary Microsoft technology silo” to me.
Ben Laurie - Pork Chops in Cider
This recipe was originally cooked for me by my friend Honk. Well, something somewhat like it was.
pork chops
garlic
mixed herbs
onions
olive oil
butter
mushrooms
more garlic
cider
cooking apples
cream
Smear the pork chops with crushed garlic, mixed herbs, salt and pepper. I’m lazy, so I only do one side. Grill them so they’re nicely browned on each side but not cooked all the way through. Starting with them slightly frozen might help.
Meanwhile, roughly chop the onions and cook them in olive oil until sweet. Add butter, sliced mushrooms, more crushed garlic, salt and pepper. Be generous with the butter. Cook the mushrooms very gently until they’re soft and laden with butter. Add cider, quarter of a can at a time, reducing it to almost nothing each time. Somewhere in there, add sliced cooking apple. When the apple is nearly soft and you still have some cider left, add the grilled pork chops to this mess. Cook them in the final reduction of cider. Once that’s done, turn off the heat and add cream.
Eat. And feel fat.
For guidance, for 4 pork chops I used four flat mushrooms, three 750 ml cans of cider and two cooking apples. Cream I leave to your conscience. Obviously the chops need some fat on them or this will be totally boring. In fact, I lust after the chops I once saw in The Ginger Pig which were over half fat. Yum.
Sauteed potatoes are good with this. And peas if you’re lazy. Spinach if less so.
Ben Laurie - Microsoft’s Open Specification Promise
The Software Freedom Law Centre has published an analysis of the OSP. I don’t really care whether the OSP is compatible with the GPL, but their other points are a concern for everyone relying on the OSP, whether they write free software or not.
Ben Laurie - Bad Phorm?
As anyone even half-awake knows, there has been a storm of protest over Phorm. I won’t reiterate the basic arguments, but I am intrigued by a couple of inconsistencies and/or misleading statements I’m spotting from Phorm’s techies.
In an interview in The Register, Phorm’s “top boffin” Marc Burgess says
What the profiler does is it first cleans the data. It’s looking at two sets of information: the information in the request that’s sent to the website and then information in the page that comes back.
From the request it pulls out the URL, and if that URL is a well known search engine such as Google or Yahoo! it’ll also look for the search terms that are in the request.
And then from the information returned by the website, the profiler looks at the content. The first thing it does is it ignores several classes of information that could potentially be sensitive. So there’s no form fields, no numbers, no email addresses (that is something containing an “@”) and anything containing a title like Mr or Mrs.
He says “there’s no form fields”. But this is in the response from the webserver. Form fields in the request sent to the webserver are fair game, it seems. In other words, Phorm are quite happy to spy on what you type, but will ignore form fields sent to you by the server - well, that’s big of them: those fields are usually empty. It’s interesting that many people have picked this up as a contradiction (that is, how can there be no form fields if you are looking at search terms, which are entered into a form field?) - but it has been carefully worded so that it is not contradictory, just easy to misinterpret.
Phorm can completely adhere to this public statement and yet still look at everything you typed. Note also that they only talk about filtering sensitive data in the response and not in the request. So nothing, it seems, is really sacred.
Incidentally, they are silent about what they do with the body of the request (usually when you submit a form, the fields end up in the body rather than the URL). That fills me with curiosity.
Even ORG swallow this bit of propaganda (from ORG’s post)
Phorm assigns a user’s browser a unique identifying number, which, it is claimed, nobody can associate with your IP address, not even your ISP.
Of course, this is nonsense. The ISP can easily associate the identifying number with your IP address - all they have to do is watch the traffic and see which IP address sends the cookie with the Phorm ID in it. In fact, they could probably use the Phorm box for this, since it already sees all the data.
And Phorm’s CEO, Kent Ertugrul, again in the interview with The Register
It’s important to understand the distinction between actually recording stuff and concluding stuff. All of our systems sit inside BT’s network. Phorm has no way of going into the system and querying “what was cookie 1000062 doing?”. And even if we did we have no way of knowing who 1000062 was. And even if we did all we could pull out of it is product categories. There’s just no way of understanding where you’ve been, what you’ve done, what you’ve searched for.
They say this, but we have to take their word for it. Obviously the fact it sits inside BT’s network is no barrier to them connecting to it. Clearly they could just look at the traffic traversing the system and know exactly what cookie 1000062 is doing. And which IP address is doing it, which doesn’t tell you who is doing it, but certainly narrows it down. Analysis of the data will almost certainly allow identification of the individual concerned, of course.
Not, of course, that taking people’s word for their privacy practices is unacceptable - it is pretty much unavoidable. What I object to is Phorm’s attempts to convince us that it is impossible for them to misbehave. Of course, it is not.
Now let’s take a look at BT’s FAQ
Is my data still viewed when I am not participating?
No, when you don’t participate or switch the system off — it’s off. 100%. No browsing data whatsoever is looked at or processed by BT Webwise. We should be clear: the Phorm servers are located in BT’s network and browsing data is not transmitted outside. Even if you are opted out, websites will still show you ads (as they do now) but these will not be ads from the Phorm service and they will not be more relevant to your browsing. In addition, you will also not get extra protection from fraudulent websites.
This is just obviously a lie. Since opt-out is controlled by a cookie, the system must look at your browsing data in order to determine whether you have the opt-out cookie or not. Naughty BT.
Furthermore, it is difficult to imagine how they could architect a system where your data did not traverse some box doing interception, though it may, of course, decide not to look at that data. But once more we’d have to take their word for it. How can we ever be sure they are not? Only by having our data not go to the box at all.
Talk Talk say they are going to architect their system in this way, somewhere in the comments on this post. I await details with interest - I can’t see how they can do it, except by either pushing the traffic through some other interception box, which doesn’t really change the situation at all, or by choosing whether to send to the Phorm box on the basis of IP address - which does not identify the user, so, for example, I could find myself opted-in by my children, without my knowledge!
All these worries apply to the system working as intended. What would happen if the Phorm box got pwned, I dread to think. I hope they’ve done their homework on hardening it! Of course, since they have “no access to the system”, it’ll be interesting to see how they plan to keep it up-to-date as attacks against it evolve.
Ben Laurie - RFC 5155
After nearly 4 years of mind-bending minutiae of DNS (who would’ve thought it could be so complicated?), political wrangling and the able assistance of many members of the DNSSEC Working Group, particularly my co-authors, Roy Arends, Geoff Sisson and David Blacka, the Internet Draft I started in April 2004, “DNSSEC NSEC2 Owner and RDATA Format (or; avoiding zone traversal using NSEC)” now known as “DNS Security (DNSSEC) Hashed Authenticated Denial of Existence” has become RFC 5155. Not my first RFC, but my first Standards Track RFC. So proud!
Matasano Chargen explain why this RFC is needed, complete with pretty pictures. They don’t say why it’s complicated, though. The central problem is that although we all think of DNS as a layered system neatly corresponding to the dots in the name, it isn’t.
So, you might like to think, and it is often explained this way, that when I look up a.b.example.com I first ask the servers for . who the nameservers for com are. Then I ask the com nameservers where the nameservers for example.com are, who I then ask for the nameservers for b.example.com and finally ask them for the address of a.b.example.com.
But it isn’t as easy as that. In fact, the example.com zone can contain an entry a.b.example.com without delegating b.example.com. This makes proving the non-existence of a name by showing the surrounding pair rather more challenging. The non-cryptographic version (NSEC) solved it by cunningly ordering the names so that names that were “lower” in the tree came immediately after their parents. Like this:
a.example.com
b.example.com
a.b.example.com
g.example.com
z.example.com
So, proving that, say, d.example.com doesn’t exist means showing the pair (a.b.example.com, g.example.com). Note that this pair does not prove the nonexistence of b.example.com as you might expect from a simple lexical ordering. Unfortunately, once you’ve hashed a name, you’ve lost information about how many components there were in the name and so forth, so this cunning trick doesn’t work for NSEC3.
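The ordering is easy to play with in code. What follows is a rough sketch, not a DNSSEC implementation: it compares names label by label starting from the rightmost label, so a parent sorts immediately before its children, and then finds the pair of existing names that brackets a missing one. The helper names canonical_cmp and covering_pair are made up for illustration, names are assumed to be plain lower-case ASCII, and covering_pair assumes the name it is given really is absent from the list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compare two names label by label, rightmost label first.
sub canonical_cmp {
    my($x, $y) = @_;
    my @x = reverse split /\./, lc $x;
    my @y = reverse split /\./, lc $y;
    while (@x && @y) {
        my $c = shift(@x) cmp shift(@y);
        return $c if $c;
    }
    # One name is a suffix of the other: the shorter one (the parent) is first.
    return scalar(@x) <=> scalar(@y);
}

# Hypothetical helper: the (previous, next) pair around a name that is
# assumed not to be in the sorted list; wraps around at either end.
sub covering_pair {
    my($name, @sorted) = @_;
    my $prev = $sorted[-1];
    for my $n (@sorted) {
        return ($prev, $n) if canonical_cmp($name, $n) < 0;
        $prev = $n;
    }
    return ($sorted[-1], $sorted[0]);
}

my @zone = sort { canonical_cmp($a, $b) } qw(
    z.example.com a.b.example.com g.example.com b.example.com a.example.com
);
print "$_\n" for @zone;

my @pair = covering_pair('d.example.com', @zone);
print "d.example.com is covered by (", join(', ', @pair), ")\n";
```

Sorting reproduces the list above, and the covering pair for d.example.com comes out as (a.b.example.com, g.example.com). Hash the names first and the sort order no longer says anything about parents and children, which is exactly the information NSEC3 has to recover some other way.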
It turns out that in general, to prove the nonexistence of a name using NSEC you have to show at most two records, one to prove the name itself doesn’t exist, and the other to show that you didn’t delegate some parent of it. Often the same record can do both.
In NSEC3, it turns out, you have to show at most three records. And if you can understand why, then you understand DNS better than almost anyone else on the planet.
Ben Laurie - Microsoft Buys Credentica
Kim and Stefan blog about Microsoft’s acquisition of Stefan’s selective disclosure patents and technologies, which I’ve blogged about many times before.
This is potentially great news, especially if one interprets Kim’s
Our goal is that Minimal Disclosure Tokens will become base features of identity platforms and products, leading to the safest possible intenet. I don’t think the point here is ultimately to make a dollar. It’s about building a system of identity that can withstand the ravages that the Internet will unleash.
in the most positive way. Unfortunately, comments such as this from Stefan
Microsoft plans to integrate the technology into Windows Communication Foundation and Windows Cardspace.
and this from Microsoft’s Privacy folk
When this technology is broadly available in Microsoft products (such as Windows Communication Foundation and Windows Cardspace), enterprises, governments, and consumers all stand to benefit from the enhanced security and privacy that it will enable.
sound more like the Microsoft we know and love.
I await developments with interest.
Ben Laurie - Users Are Stupid…
… and they won’t take sensible security decisions, so we have to dumb everything down for them. Or at least, that’s what we whine. So, I have to ask, WTF is this all about?

How is anyone supposed to figure out what to do now? Surely we know which of those three errors it was? So why are we giving the user such appallingly crap feedback?
And I can totally imagine how the conversation with Majestic’s webmaster is going to go…
Me: Hey, I got this error. Apparently you’re either using a CA that’s not in a list I’m not going to give to you, or you screwed up your server config so your certificate is incomplete, or you are a phisher and I’ve just given you my password, or you aren’t a phisher but your cert and website don’t match. Please fix it.
Webmaster: O RLY? Well, I’ve got this email that says your wine order is either in the post, not in the post, cancel