Simple Comments and OpenID (3/5) | WebReference

OpenID Associations

As noted on the first page of this article, RPs must communicate with OPs via both direct and indirect communications, but with some cleverly crafted communications between the two, the number of those communications in a typical OpenID authentication can often be reduced to a single message.

Signatures and Shared Secrets

Before we go there, let's take a moment to understand what the RP to OP communications (and vice versa) are used for. First, an RP will contact the OP indirectly (as described earlier in this article) in order to ask whether this particular end user is indeed the owner of the identifier they are claiming. The OP's response to this question is also received indirectly (since we must somehow get the user's browser back to the RP's site). When this response is received, the RP will then contact the OP again (this time directly) to verify that the response was in fact genuine, and not a fabricated response from some middleman pretending to be the OP.

The primary means of validating the OP's response is by way of a signed digest, or signature, of the data included within the message. The two supported signature methods in OpenID are HMAC_SHA1 and HMAC_SHA256, both of which require a known, shared secret in order to generate the appropriate signature.

If you're foggy on the notion of signatures in general and HMAC sigs in particular, an example should help to clear up both points. In Perl, I like using the Digest::SHA module to generate these types of signatures, as it's simple, straightforward and provides access to multiple different SHA algorithms. Study the following code:

#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256);
use MIME::Base64 qw(encode_base64 decode_base64);
my $test_data = 'My string of response data.';
my $secret    = 'foobarbaz';
# First, let's generate a signature of the
# testdata, using our defined "secret" key
my $sig = encode_base64(hmac_sha256($test_data, $secret), '');
# $sig = OlHWnDPvJxt1PlPKN0Az2mYkA2X9DlJgrq7pPTHwQBY=

Now, someone who receives this data and knows the shared secret can verify that information as follows:

#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256);
use MIME::Base64 qw(encode_base64 decode_base64);
# Our shared secret
my $secret = 'foobarbaz';
# Retrieve the passed in data, and the signature
my ($test_data, $sig) = @ARGV;
# Generate our OWN sig, based on the shared secret
my $test_sig = encode_base64(hmac_sha256($test_data, $secret), '');
print $test_sig, "\n";
# Now, do we match?
if ($sig eq $test_sig) {
   print 'Data valid.', "\n";
}
else {
   print 'Data is invalid.', "\n";
}

It's critical, of course, that the shared secret isn't known outside of the two parties that will use it to verify their communications. So long as it's a secret, then the two parties can check each other's data to ensure that it really was sent from the other.

The following live example verifies this point. Fill in the Test Data input box with anything you like, such as a simple phrase or sentence. Then click the Generate Digest button to generate a Base64 encoded string that represents the HMAC_SHA256 digest of that test data, with a secret key of our own choosing. Finally, click the Verify Digest button, which sends both the data and the digest back to the server. The server applies the same HMAC_SHA256 algorithm to the test data, compares the result to the digest you sent, and returns an answer indicating whether that digest is valid. The same secret key is used in both the generation and verification of the digest. Try this test a few times, with both accurate and altered input (i.e., change the data and/or digest before you verify), and note that the data will only validate if you send the identical test data and the originally generated digest.

Note: A JavaScript enabled later model browser (Mozilla, IE6+, etc.) is required to use this example.


Signatures and shared secrets provide us with a means to verify that data that we thought was sent from someone we know really was sent from that somebody, and not someone pretending to be that somebody. So how does all this relate to OpenID? When an OP sends back their assertion to an authentication request (i.e., either confirming or denying that the user actually owns the identifier they're claiming), they'll include a signature field of the sent data which the RP can check to ensure the returned data is indeed a valid assertion from that OP.

Direct Verification in OpenID

In the most basic form of OP assertion validation, we send all the data we got from the OP--including the signature--back to the OP themselves for verification; the process is nearly identical to the simple example above, where we sent both the test data and the generated signature back to the server. This is the simplest means of checking the assertion: The OP simply reads in all the data, recomputes the digest from it, and compares the result to the provided signature. If it matches, a valid response is returned, just like our example above.

As described so far, this verification might not seem to prove much. It's akin to asking your boss for permission to take the rest of the day off and then, when they've already said yes, turning around and asking, "Did you really say yes?" Nonetheless, it's critical. Remember that the assertion that comes from the OP doesn't go directly to your script, but rather through a redirect of the end user's browser. For all we know, the assertion data we receive could have literally been typed in by the end user themselves, and not sent from the OP at all! Double-checking the signature via a direct communication with the OP provides us with the assurance we need that the message was legitimate. For its part, the OP should process each such verification once, and once only, to avoid potential replay attacks from bad guys who are watching the network traffic.
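To make the direct check a little more concrete, here is a minimal sketch of the two halves an RP needs around that request, assuming the OpenID 2.0 conventions: the RP posts the assertion back with openid.mode changed to check_authentication, and the OP answers with key:value lines that include is_valid. The helper names (check_auth_params, parse_key_value) and the sample data are hypothetical, and the HTTP POST itself is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy the assertion we received and switch its mode so that the OP
# treats it as a verification request. (Hypothetical helper name.)
sub check_auth_params {
    my (%assertion) = @_;
    my %request = %assertion;
    $request{'openid.mode'} = 'check_authentication';
    return %request;
}

# Parse a direct response body made of "key:value" lines.
# Only the first colon separates key from value.
sub parse_key_value {
    my ($body) = @_;
    my %fields;
    for my $line (split /\n/, $body) {
        my ($key, $value) = split /:/, $line, 2;
        $fields{$key} = $value if defined $value;
    }
    return %fields;
}

# Made-up assertion data standing in for what arrived via the browser
my %assertion = (
    'openid.mode'     => 'id_res',
    'openid.identity' => 'http://example.com/alice',
    'openid.signed'   => 'mode,identity',
    'openid.sig'      => 'c2lnbmF0dXJl',
);
my %request = check_auth_params(%assertion);
print "Mode sent to OP: $request{'openid.mode'}\n";

# A sample reply body, as an OP might return it
my %reply = parse_key_value("ns:http://specs.openid.net/auth/2.0\nis_valid:true\n");
print(($reply{is_valid} || '') eq 'true' ? "OP confirmed\n" : "OP rejected\n");
```

Anything other than an explicit is_valid:true is treated as a rejection in this sketch, which is the cautious default for a check like this.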

Verification via Associations

The above verification method requires an additional direct communication between the RP and OP with each OpenID authentication. If the RP and OP have previously negotiated a shared secret, then the final verification of the OP's signature can be performed directly by the RP without requiring another trip to the OP.
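As a rough sketch of what that local check can look like, the code below assumes the OpenID 2.0 signing convention: the fields named in openid.signed are written out as key:value lines, in order, and that text is HMAC-signed with the association's secret. The sub names and sample values are hypothetical, and the example is simplified: a real response signs more fields, and a real association secret arrives Base64 encoded and must be decoded first.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256);
use MIME::Base64 qw(encode_base64);

# Build the text that gets signed: one "key:value\n" line per field
# named in openid.signed, in the order listed there.
sub signature_base {
    my ($params) = @_;
    my @fields = split /,/, $params->{'openid.signed'};
    return join '', map { "$_:" . $params->{"openid.$_"} . "\n" } @fields;
}

# Recompute the signature with the shared secret and compare it
# to the one that came with the assertion.
sub assertion_is_valid {
    my ($params, $mac_key) = @_;
    my $expected = encode_base64(
        hmac_sha256(signature_base($params), $mac_key), '');
    return $expected eq $params->{'openid.sig'};
}

# Made-up data standing in for a real association and OP response
my $mac_key  = 'foobarbaz';
my %response = (
    'openid.mode'      => 'id_res',
    'openid.identity'  => 'http://example.com/alice',
    'openid.return_to' => 'http://rp.example.org/comments',
    'openid.signed'    => 'mode,identity,return_to',
);
$response{'openid.sig'} = encode_base64(
    hmac_sha256(signature_base(\%response), $mac_key), '');

print assertion_is_valid(\%response, $mac_key) ? "valid\n" : "invalid\n";

# Tamper with a signed field; the stored signature no longer matches
$response{'openid.identity'} = 'http://example.com/mallory';
print assertion_is_valid(\%response, $mac_key) ? "valid\n" : "invalid\n";
```

No request to the OP is involved here; the RP only needs the secret it negotiated earlier.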

How such a shared secret is negotiated depends on the capabilities of the RP and OP, and the type of connection between them. If the OP performs all their communications via an encrypted (i.e., SSL) connection, then it's acceptable to transfer the secret in the clear, since the encrypted connection prevents outside parties from "listening in" on the conversation. Even if the connection between the RP and OP isn't encrypted, there's still a way to share a secret such that bad-guy traffic sniffers won't know what it is.

Enter Diffie-Hellman key exchange, which enables two parties to derive a common shared secret without actually passing the secret between them. Instead, each party picks a random private key and derives a public key from it, using a pre-agreed prime modulus and generator (base). The two parties then share their public keys with one another (keeping their private keys safely tucked away in some safe corner of their own environments). Finally, each side arrives at the same shared secret by raising its partner's public key to the power of its own private key, modulo the pre-agreed prime. The result? A unique number--known only to the two parties--that can then be used as a key for HMAC signing (or any other cryptographic use, for that matter). It's all very mathematical.
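A toy run with deliberately tiny numbers may make the exchange easier to follow. The values below (prime 23, generator 5, private keys 6 and 15) are illustrative assumptions only, far too small for real use, and the Alice and Bob names are just the customary stand-ins for the two parties.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Math::BigInt;

# Pre-agreed public values (toy sized, for illustration only)
my $p = Math::BigInt->new(23);   # prime modulus
my $g = Math::BigInt->new(5);    # generator

# Each party's private key; real ones are large random numbers
my $alice_private = Math::BigInt->new(6);
my $bob_private   = Math::BigInt->new(15);

# public = g ** private mod p
# (copy() first, because bmodpow modifies the object it is called on)
my $alice_public = $g->copy->bmodpow($alice_private, $p);
my $bob_public   = $g->copy->bmodpow($bob_private,   $p);

# Only the public keys are exchanged. Each side then computes
# secret = partner_public ** own_private mod p
my $alice_secret = $bob_public->copy->bmodpow($alice_private, $p);
my $bob_secret   = $alice_public->copy->bmodpow($bob_private, $p);

print "Public keys:    $alice_public and $bob_public\n";   # 8 and 19
print "Alice computes: $alice_secret\n";                    # 2
print "Bob computes:   $bob_secret\n";                      # 2
```

Both sides reach the same value, 2, even though neither private key ever crossed the wire.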

It also requires operations on big integers. Really big integers. For example, here's the default prime number used in OpenID shared secret negotiations (it's line wrapped so that we don't force this page to be abnormally wide):

1551728981814736974712322577637155
3991572480196691540447970779531405
7629378541917580651227423698188993
7278161526466314385615958256881888
8995127215884267541995034125870655
6549803580104870537681476726513255
7470407658574792912915723345106432
4509471500722962109419434978392598
4760375594985848253359305585439638443

Core Perl, and indeed computers in general, aren't designed to handle numbers this large. Consider, for example, this trivial code:

#!/usr/bin/perl -T
use strict;
use warnings;
my $test_int = 12345678901234567;
my $new_int  = $test_int + 1;
print $test_int, "\n";  # 1.23456789012346e+16
print $new_int, "\n";   # 1.23456789012346e+16
if ($test_int == $new_int) {
   print "Oops....\n";  # Shouldn't get here, but does
}

Obviously, the two numbers shouldn't be equal--but Perl thinks they are and prints "Oops..." as a result. The root of the problem is an inherent loss of precision on numbers with a large number of digits, but a full discussion is outside of the scope of this article. (The output shown is from a Perl built with 32-bit integers; a 64-bit build handles this particular value exactly, but hits the same wall with numbers the size of the prime above.) For those interested in pursuing the topic in more detail, see perlfaq4 and perlnumber in your Perl documentation.

Fortunately for us, there's an easy answer, in the form of Perl's standard Math::BigInt module. Behind the scenes, Math::BigInt breaks down an integer string (like '12345678901234567') into multiple, smaller integer numbers (the kind that Perl can work with natively). When calculations and prints on the number are called for, it works independently on the smaller numbers to arrive at the correct conclusion, which it then pieces back together as a string for the printed result. With Math::BigInt, the above code example becomes:

#!/usr/bin/perl -T
use strict;
use warnings;
use Math::BigInt;
my $test_int = Math::BigInt->new('12345678901234567');
my $new_int  = $test_int + 1;
print $test_int, "\n";  # 12345678901234567
print $new_int, "\n";   # 12345678901234568
if ($test_int == $new_int) {
   print "Oops....\n";  # Doesn't get here
}

Big Number Performance

Armed with Math::BigInt, we can now perform calculations on large integers, the kind that will allow us to perform secret key exchanges as described in the section above, but unless you have the beefiest computers and/or an ample amount of patience, Math::BigInt alone isn't enough.

The problem is performance. The kind of calculations we need for Diffie-Hellman key exchanges take lots and lots of CPU juice. How much? Consider this calculation, which raises 2 to the power of a large random number and then takes the result modulo the prime number above (an actual calculation used in the key exchange described above). Even with binary exponentiation (as utilized by the Math::BigInt module), this calculation takes over 13 seconds to complete on my test machine:

#!/usr/bin/perl -T
use strict;
use warnings;
use Math::BigInt;
my $p = Math::BigInt->new('1551728981814736974712322577637155' .
        '3991572480196691540447970779531405' .
        '7629378541917580651227423698188993' .
        '7278161526466314385615958256881888' .
        '8995127215884267541995034125870655' .
        '6549803580104870537681476726513255' .
        '7470407658574792912915723345106432' .
        '4509471500722962109419434978392598' .
        '4760375594985848253359305585439638443');
my $pk = Math::BigInt->new('1234567890123456789012345678901234' .
        '56789012345678901234567890');
my $g = Math::BigInt->new('2');
my $pow = $g->bmodpow($pk, $p);

Obviously, 13 seconds is way too long - it's likely most end users will abandon your form well before that. The low-level, heavily optimized GMP library cuts down the processing time dramatically; and, as luck would have it, Math::BigInt can be told to use that library, if it is available:

#!/usr/bin/perl -T
use strict;
use warnings;
use Math::BigInt lib => qw(GMP);
my $p = Math::BigInt->new('1551728981814736974712322577637155' .
        '3991572480196691540447970779531405' .
        '7629378541917580651227423698188993' .
        '7278161526466314385615958256881888' .
        '8995127215884267541995034125870655' .
        '6549803580104870537681476726513255' .
        '7470407658574792912915723345106432' .
        '4509471500722962109419434978392598' .
        '4760375594985848253359305585439638443');
my $pk = Math::BigInt->new('1234567890123456789012345678901234' .
        '56789012345678901234567890');
my $g = Math::BigInt->new('2');
my $pow = $g->bmodpow($pk, $p);

Now the calculation is performed in less than a second--back to a respectable and usable time for our needs.
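If you'd like to see how your own server fares, a small timing sketch along these lines may help. It uses the same prime and exponent as the listings above, asks Math::BigInt to try GMP (falling back quietly to its default backend if GMP isn't installed), and reports which backend was actually used. The timings shown elsewhere on this page are from the author's 2008 test machine, so expect different numbers.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Math::BigInt try => 'GMP';   # use GMP if present, else the default
use Time::HiRes qw(time);

my $p = Math::BigInt->new('1551728981814736974712322577637155' .
        '3991572480196691540447970779531405' .
        '7629378541917580651227423698188993' .
        '7278161526466314385615958256881888' .
        '8995127215884267541995034125870655' .
        '6549803580104870537681476726513255' .
        '7470407658574792912915723345106432' .
        '4509471500722962109419434978392598' .
        '4760375594985848253359305585439638443');
my $pk = Math::BigInt->new('1234567890123456789012345678901234' .
        '56789012345678901234567890');
my $g = Math::BigInt->new('2');

# Time the modular exponentiation; copy() keeps $g itself unchanged
my $start   = time;
my $pow     = $g->copy->bmodpow($pk, $p);
my $elapsed = time - $start;

printf "Backend: %s\n", Math::BigInt->config()->{lib};
printf "Elapsed: %.3f seconds\n", $elapsed;
```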

OpenID Association Recap

I went through many different details on this page, so let's briefly recap what we've learned before we move on:

Association-based verifications in Simple Comments require several pieces to be in place on your server; among them, the Digest::SHA module, and, if the OP doesn't support SSL-protected communications, the Math::BigInt module and the GMP library. Some OPs may additionally require that your RP be able to produce HMAC_SHA256 digests, as opposed to the older (and weaker) HMAC_SHA1 digests. Simple Comments checks for each of these required components, and provides association-based OP verifications only if each of the necessary pieces is available. If not, it defaults to direct verifications--i.e., it asks the OP, via a separate, direct communication, if the data that we've received was really sent from them in the first place.
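The component check described above could be sketched roughly as follows. This is not Simple Comments' actual code, which isn't shown on this page; the sub names and the decision logic are assumptions based on the requirements just listed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True if the named module can be loaded on this server
sub module_available {
    my ($module) = @_;
    return eval "require $module; 1" ? 1 : 0;
}

# Pick a verification method from what's installed and from whether
# the OP endpoint is reached over SSL. (Hypothetical decision logic.)
sub choose_verification {
    my ($op_uses_ssl) = @_;

    # No way to compute HMAC signatures locally: ask the OP each time
    return 'direct' unless module_available('Digest::SHA');

    # Over SSL the secret can be sent in the clear; no key exchange
    return 'association' if $op_uses_ssl;

    # Otherwise Diffie-Hellman is needed, and it is only practical
    # with the GMP backend for Math::BigInt
    return module_available('Math::BigInt::GMP') ? 'association' : 'direct';
}

print "SSL endpoint:     ", choose_verification(1), "\n";
print "Non-SSL endpoint: ", choose_verification(0), "\n";
```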

If you think it's a shame to devise and implement such an elaborate method of ensuring the validity of a message between two parties solely for the purpose of authenticating an individual's claim to an identifier, you're not alone. On the next page we examine OpenID extensions, which allow RPs and OPs to communicate additional data in their exchanged messages beyond the simple confirmation or rejection of a user's identifier.



Created: July 31, 2008
Revised: July 31, 2008

URL: http://webreference.com/programming/perl/comments/openid/3.html