Google+ Demographics in Google Analytics

There have been some interesting developments in Google Analytics recently that have made it even more useful for developers. The Google Play team announced that you can now track app install referrals using Analytics, and on the web side the Analytics team have started rolling out Demographics and Interests information.

Demographic data is a great way of getting a view of how the usage of your web app varies with the age or gender of the person using it. This is powerful, but it got me thinking about how you might do the same thing within a mobile application. Some of the same information is actually available via a user's Google+ profile. By integrating Google Analytics into your application you can get measures of the usage of various parts of the app, and by grabbing the age range and gender from Google+ Sign-In you can look at how the different flows break down by the demographics of your users!

Setting up Google Analytics

This is done through the use of a feature called Custom Dimensions, which allows you to associate arbitrary data with a trackable event. I'll quickly walk through how to add Google Analytics to an Android or iOS application that has Google+ Sign-In, and start tracking the age range and gender.

First, make sure you have a Google Analytics property. If you've got an existing web property you can go right ahead - if not, you'll have to set up a dummy web property first. Then create a mobile property from the Property drop-down in the Admin section.

Once you've created it, the site will give you links to the Analytics SDK for Android and iOS - grab the one you want! Then, on the left under Custom Definitions > Custom Dimensions, you can set up new dimensions for age range and gender. Note that Analytics gives you the code snippets right away, so when creating your own dimensions definitely copy those! In this case I created them as user-level dimensions, as they're associated with all the events from the user who is browsing the app.

In the app

First you need to set up the basic configuration to log events. This means defining the Analytics property ID you're using to track your app. On iOS it's easiest to do this somewhere central, like the AppDelegate. On Android, we can define the details in an XML file and reference it from there.

In Android, you need to first add a permission (in addition to the INTERNET permission Google+ Sign-In already uses) in the AndroidManifest.xml, to allow the tracking library to determine the network state.

<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />

Next you need to include the library in the project. I'm using Android Studio, so I grabbed the libGoogleAnalyticsServices.jar file from the SDK zip and put it in the libs/ folder of my project (at the same level as the src/ directory). To include it I just added a dependency to my gradle config:


dependencies {
    compile 'com.google.android.gms:play-services:3.2.+'
    compile 'com.android.support:appcompat-v7:18.0.0'
    compile files('libs/libGoogleAnalyticsServices.jar')
}

Next you need to define the property ID and basic configuration for Analytics. In res/values add a new file, analytics.xml, that includes the tracking ID and sets whether automatic activity tracking and exception tracking should be enabled:
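As a sketch, that file might look like the following - the tracking ID is a placeholder, and the parameter names follow the EasyTracker configuration docs, so double-check them against your SDK version:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- res/values/analytics.xml - sketch only; use your own tracking ID. -->
<resources>
    <!-- The property ID from the Analytics admin console. -->
    <string name="ga_trackingId">UA-XXXXXXXX-Y</string>
    <!-- Automatically track Activities via EasyTracker. -->
    <bool name="ga_autoActivityTracking">true</bool>
    <!-- Send uncaught exceptions to Analytics. -->
    <bool name="ga_reportUncaughtExceptions">true</bool>
</resources>
```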

On iOS you need to extract the header files and the libraries from the zip file and add those to your project. You also need to link a few things on top of those included for Google+ Sign-In: libGoogleAnalyticsServices itself, the AdSupport and CoreData frameworks, and libz.dylib.

To configure the library, you can use the singleton [GAI sharedInstance] - so you'll need to #import "GAI.h". Then it is just a case of setting the same kind of options as you did on Android:

Tracking Hits

Next, in each of your activities or controllers you can log an event when the user accesses it. This gives a baseline, page-view-style metric of which parts of the app users are going to, though of course many more subtle events and interactions can be captured! On Android, it is easiest to use the EasyTracker class in the library to track your Activity views, by adding calls to your onStart and onStop methods. This will automatically reference the analytics.xml file you set up earlier:

On iOS, you have a couple of options: you can either track the view manually, or use GAITrackedViewController as your parent class. In this case I chose the former. The first thing is to define the screen name variable for the controller, then send the hit. Note that this snippet also requests that the sign-in class automatically fetch the current user's profile, so it'll be available when you want to log the hit with demographic information.

Demographic information

Finally, you can enhance the tracking with the demographic information. Once the user has signed in you have their profile easily available from PlusClient.getCurrentUser or [GPPSignIn sharedInstance].googlePlusUser. You can retrieve the ageRange fields and the gender field and log them as custom dimensions. Note that you actually track against the dimension index, so in my case age range is 1 and gender is 2. Make sure you're using the indices of the actual dimensions you want to track in your code!

On Android you can extract the age range and concatenate it into a string to send. The gender flag is an int mapping to a constant, so map that to a string as well. I did this in the onConnected callback that fires whenever an authenticated connection to Google Play Services is made.
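As a rough sketch of that mapping in plain Java - the numeric gender values here are assumed to mirror the Person.Gender constants (0 male, 1 female, 2 other), so check the SDK javadoc for your version:

```java
// Sketch: turn Google+ profile demographics into strings for custom
// dimensions. The 0/1/2 gender codes are an assumption - verify them
// against Person.Gender in your Play Services release.
public class DemographicMapper {
    static String genderToString(int gender) {
        switch (gender) {
            case 0:  return "male";
            case 1:  return "female";
            default: return "other";
        }
    }

    // An age range with min 21 and max 34 becomes the string "21-34".
    static String ageRangeToString(int min, int max) {
        return min + "-" + max;
    }

    public static void main(String[] args) {
        System.out.println(genderToString(1));
        System.out.println(ageRangeToString(21, 34));
    }
}
```

These strings are then what you'd pass as the custom dimension values when building the hit.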

On iOS you can add the call to the authentication callback that is triggered after a manual sign-in or a call to trySilentAuthentication. If you're not calling that on each controller, you could check whether [[GPPSignIn sharedInstance] authentication] is set and run similar code there.

And that's it! The data should now start flowing into Google Analytics for custom reporting. You can read more about custom dimensions and metrics for iOS and Android on the Google Analytics site.

Read More

Crouching Sharebox, Hidden Android

With all the share box related things happening over the last week or so, I realised there are a couple of non-obvious parts of the Android PlusShare class which might be of interest to people.

Google Play Services Share box

I actually had to test for myself to verify this feature of Google Play Services, which I was aware of but hadn't actually tried! If you are building an app which includes Google+ sharing, you might be used to it firing an intent to the Google+ app, which gives you the familiar Google+ sharing experience.

Because the Google+ app comes with many devices, you may not have worried too much about what would happen if it wasn't available - but the Google+ Android platform team did. They actually built a backup share box right into Google Play Services. You call it just like you would the regular share box: if Google+ is installed it will be used, otherwise the Google Play Services share box will handle the intent instead.

It's a little bit simpler than the primary share box, but it does the job and allows users to share to their Google+ circles even when they don't have the Google+ app installed on their device. Note that the Google Play Services share box only handles the "text/plain" type - so you can't use it to share videos or images!

PlusShare createPerson

One of the tips in Gus and Joanna's Google I/O best practices talk was to supply some recipients when creating a share, in order to encourage targeted sharing. There have been some really interesting uses of that: for example, Mashable use it on the web with interactive posts to encourage conversation by giving you a "discuss" button, then adding the user that shared with you as a recipient.


If you want to do something like this then you may know the name and ID of the user, but not have an Android Person object for them because they're not in the user's circles. Luckily, PlusShare has a helper method that makes creating a user for this purpose easy. If I wanted to fill in Silvano as a sharing suggestion for users of my app, I could do so like this:

Read More

Calling Google apps on iOS & X-Callback-Url

I previously mentioned how easy it is to deep link straight into the Google+ app on iOS, to view a user's profile or similar, using the gplus:// scheme and UIApplication openURL. This sort of integration is pretty much the only way to do general inter-app communication in iOS (bar the new audio APIs), and several other Google apps offer their own URL schemes which let you make very straightforward integrations.

Mostly these are ways of giving users a slightly smoother experience when entering the app, but one allows you to do a little bit more!

Google Maps

With Google Maps, you can specify coordinates to focus on, a search to perform, or two locations to ask for directions between. All the parameters are documented on the maps site. As a simple example, searching for pizza near me looks like this:
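From memory that call looks something like the following - treat the exact scheme and parameters as an assumption and check the Maps URL scheme docs:

```
comgooglemaps://?q=pizza
```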

Gmail

With Gmail you can go straight to the compose screen and specify the recipient and the subject line, much like with web mailto: links. This can be a good email option in case users don't have Mail.app set up on their phone.

Open In Chrome

Chrome on iOS has a different option though. It allows you to not only open a page within Chrome, but also to have a return button in the Chrome nav bar so users can come back to your app afterwards. Why would you do this instead of popping up a UIWebView? Mainly because sending the user to their regular browser means they're much more likely to be signed in to any services on the page - for example for social share or endorsement buttons (like the +1 button). It's also a great addition to a UIActivityView, where users can bookmark easily. The fact that the user gets a button pointing back to your app makes it much less likely that they'll wander off rather than continue their experience with you.

This is all implemented via a custom URL scheme, googlechrome-x-callback://, which you can test for with canOpenURL and call with openURL. However, there is a controller on GitHub which handles everything for you - and for Google+ developers it's actually included in GoogleOpenSource.framework. Opening a link in Chrome just requires that you have defined a custom URL type for your own app, which you can pass to the OpenInChromeController. In this example my custom URL type is myuri:// and I am asking Chrome to open this site:

When I tap the GoogleMe nav item, I'm returned straight to my app. This happens because Chrome, along with other popular apps like Instapaper, implements an open scheme called x-callback-url, and can invoke my custom URL when the user presses the button.

That's pretty much all you need for Open In Chrome, but the x-callback-url scheme is something you could support in your own application. It just needs a custom URL type in your app, and to be able to handle URLs of the format: yourscheme://x-callback-url/[action]?param=val&param2=val2. There are some pre-defined parameters, such as:

x-source: the name of the calling app. In the Chrome example above, my app was called GoogleMe.
x-success: the callback URL. In the example above was myuri://home.
x-error and x-cancel: for indicating other results

On top of those there can be other parameters, such as Chrome's url parameter, which takes the site you would like to show to the user.

Let's say we had an app which allowed users to subscribe to an RSS feed. The x-callback-url format we define might look like this:

myreader://x-callback-url/subscribe?x-source=myapp&x-success=myuri%3a%2f%2f&feed=foo

To parse that in iOS we'd need to implement the application:openURL:sourceApplication:annotation: method on our app delegate, and parse the incoming structure. There are some libraries already built to do this, but for educational purposes the steps are fairly simple: take the URL, check the parameters are as you expect, and parse out the ones you need. At the end, call back to the appropriate URL with openURL.
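Those parsing steps can be sketched outside of iOS too - here's plain Java standing in for the app delegate code, with java.net.URI doing the splitting:

```java
import java.net.URI;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

public class XCallbackParser {
    // Parse yourscheme://x-callback-url/[action]?param=val&... into a map
    // containing the action plus the decoded parameters.
    static Map<String, String> parse(String url) throws Exception {
        URI uri = new URI(url);
        Map<String, String> params = new HashMap<>();
        params.put("action", uri.getPath().substring(1)); // drop leading "/"
        for (String pair : uri.getRawQuery().split("&")) {
            String[] kv = pair.split("=", 2);
            params.put(kv[0], URLDecoder.decode(kv[1], "UTF-8"));
        }
        return params;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> p = parse(
            "myreader://x-callback-url/subscribe?x-source=myapp&x-success=myuri%3a%2f%2f&feed=foo");
        System.out.println(p.get("action"));    // subscribe
        System.out.println(p.get("x-success")); // myuri://
        System.out.println(p.get("feed"));      // foo
    }
}
```

After handling the action you'd invoke the x-success URL (here myuri://) to hand control back to the calling app.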

This type of inter-app communication really opens up new possibilities for building richer iOS experiences - and can be a nice way of adding the functionality from an existing service into your own app.

Read More

Attaching images to Google+ shares on Android

After the recent iOS 1.4.0 SDK update that allowed sharing images straight to Google+ from within an application, a couple of people asked whether it was possible to do the same thing on Android. Luckily, it is, and it is very straightforward.

The PlusShare.Builder class has an addStream method which you can use to attach an image when configuring a share. Note that just like on iOS, you can't use it while attaching a URL, or as part of an interactive post. The method takes a URI, and will add it to the intent as a standard EXTRA_STREAM. There's also setStream, which overrides any previously attached streams.

All we do in the snippet below is trigger a picker intent to allow the user to choose an image from the gallery. We then create a share builder and add the returned URL. Note that this usage is not passing a PlusClient when creating the Builder, so it will work for users that aren't signed in with Google+. If your users are, then definitely use the version that accepts a PlusClient!

Obligatory Silvano screenshots:

Read More

Google+ iOS SDK 1.4.0 with native sharing

I'm really happy to see the release today of version 1.4.0 of the iOS Google+ SDK. It's worth a look, in no small part because the team have fixed one of the longest standing requests for the SDK: the ability to share to Google+ without the user leaving the app. In previous versions of the SDK, the only sharing option was via the mobile browser, so the user would be switched out of then back into the app they were using. This release puts the share box fully inside the app for signed in users, and has also given it a bunch of new special powers for better sharing. All of the documentation on the Google+ iOS developer site has been revised and improved, but I wanted to highlight my favourite parts of the release here as well.

Native Sharebox

The new share box is a smoother way for users to share from within your application to Google+. The existing share box that calls out to the browser and back is still available for anonymous users, but if you're implementing Google+ Sign-In, go native share box all the way! Both implement the GPPShareBuilder interface, so your code (generally) doesn't have to care.

The change is really extremely easy - drop in the new SDK, which you can download from the Google+ developers site, and replace your current call to the shareDialog function with nativeShareDialog.

There are some new required packages to include in your linker settings as well. The complete required set is:

  • AssetsLibrary.framework
  • Foundation.framework
  • CoreLocation.framework
  • CoreMotion.framework
  • CoreGraphics.framework
  • CoreText.framework
  • MediaPlayer.framework
  • SystemConfiguration.framework
  • UIKit.framework

You'll of course also need to include the Google+ SDK files themselves!

  • GoogleOpenSource.framework (or individual open source files)
  • GooglePlus.framework
  • GooglePlus.bundle


Here's what it looks like in Xcode.

Make sure you include the bundle as well! Otherwise it may fail to find the font and give you an error like: "*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: 'Google Plus fallback bundle could not be found. Please include it in the project.'". You can add it under "Copy Bundle Resources" in the Build Phases tab in Xcode:

Upload images and videos

The new share box includes the ability to directly attach videos and images into posts, so they are displayed as if they had been shared directly onto Google+. The video call takes a local URL, and the image a UIImage. Note that both of these are only available with the native share box, so you will have to cast your GPPShareBuilder to a GPPNativeShareBuilder to call them.

For example, to attach an image to a post:

New Error Callback

While the old finishedWithShare handler is still available, there's a new callback in GPPShareDelegate that gives more detail on errors that happen. Rather than passing in a BOOL, it follows the common model of passing an NSError that is set to nil on success. The new, replacement method is finishedSharingWithError:. This allows you to detect the share succeeding, failing, or being cancelled:

It's also worth noting that there are some conditions under which the native share box won't open, for example if you try to attach both a URL and an image. You should check the return value from the open call: if it's NO, the share box is probably misconfigured, and an error will be logged to the console.

Note that along with this, you also have an ability to close any open share dialogs - this can be helpful if an important event happens in your app, and you need to bring the user right back. As an example, the below code just arbitrarily closes the share box after 5 seconds, which is a feature you might want to implement if you particularly dislike your users.

Prefill Recipients

The native share box also adds the ability to prefill recipients, so you can target posts to specific people. This is generally for no more than about 10 recipients, and you target them with an array of IDs, such as those retrieved from the people listing call.

ID token

One of the other complaints developers have had is that it's hard to authenticate an iOS client to a server with the current SDK. With this release, full ID token support is included, which allows authenticating to a backend without giving it access to call Google APIs. Tim Bray's excellent post on the Android blog shows how to verify ID tokens on a back end for Android clients - and with this release you can share that approach between Android, the web, and iOS.

It's actually rather straightforward to get the token once the user has signed-in.

https://gist.github.com/ianbarber/6813428

The token itself looks like this. It has three dot-separated (I've tried to highlight the dots a bit!) base64-encoded sections, which contain a header, a JSON blob of claims, and a signature that allows you to cryptographically verify it was created by Google.

eyJhbGciOiJSUzI1NiIsImtpZCI6ImRjZjY2NGI3YjkyOWRmOWU3NDFmNWZhNGNjNzQyYTg3MTRhZWFiNDcifQ.eyJpc3MiOiJhY2NvdW50cy5nb29nbGUuY29tIiwic3ViIjoiMTEzMzQwMDkwOTExOTU0NzQxNjk2IiwiYXVkIjoiMzY2NjY3MTkxNzMwLWo0ZnU4cG52YzJqMXR0a3JsMWs3anU1bjVwaW0zdmV0LmFwcHMuZ29vZ2xldXNlcmNvbnRlbnQuY29tIiwiYXpwIjoiMzY2NjY3MTkxNzMwLWo0ZnU4cG52YzJqMXR0a3JsMWs3anU1bjVwaW0zdmV0LmFwcHMuZ29vZ2xldXNlcmNvbnRlbnQuY29tIiwiYXRfaGFzaCI6IlZpcWdrOTlhT21DUW5xREh6WFo2MVEiLCJpYXQiOjEzODA3OTg3MTEsImV4cCI6MTM4MDgwMjYxMX0.MMkyQ-O5F8HlLW2bRwpf4xMWR67slIm5Cu8PAdoNGPbcGZ562sJ03wq4foy5d2zSW6BGZWWkPOwNvFq-NF93fMQeR2QBISF2l_BZ0Laxiufn8vXaoqWwRdD8XEfayF0m-Wx5JlkWgkmFPbYfriaAU7vfKGltHy7Cg42srDgTbEM

Decoding the body lets us see the application that was signed in to and the user ID. If we had requested the user's email address, that would be in here as well. This allows you to authenticate the user on your application servers without having to make any other API calls, which is pretty cool!

{
  "issuer": "accounts.google.com",
  "issued_to": "366667191730-j4fu8pnvc2j1ttkrl1k7ju5n5pim3vet.apps.googleusercontent.com",
  "audience": "366667191730-j4fu8pnvc2j1ttkrl1k7ju5n5pim3vet.apps.googleusercontent.com",
  "user_id": "113340090911954741696",
  "expires_in": 3510,
  "issued_at": 1380798711
}
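The decoding step itself is trivial - here's a plain-Java sketch that peeks at the claims segment of a token, using a made-up token built in-code for illustration. Note that this does no signature verification, which a real server must do before trusting the claims:

```java
import java.util.Base64;

public class IdTokenPeek {
    // Return the decoded middle (claims) segment of an ID token.
    // This inspects only - it does NOT verify the signature.
    static String claims(String idToken) {
        String[] parts = idToken.split("\\.");
        return new String(Base64.getUrlDecoder().decode(parts[1]));
    }

    public static void main(String[] args) {
        // Build a made-up header.claims.signature token for illustration.
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        String token = enc.encodeToString("{\"alg\":\"RS256\"}".getBytes()) + "."
                     + enc.encodeToString("{\"iss\":\"accounts.google.com\"}".getBytes()) + "."
                     + enc.encodeToString("fake-signature".getBytes());
        System.out.println(claims(token)); // {"iss":"accounts.google.com"}
    }
}
```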

Some Other Helpful Things

The UIActivityView is a really nice way to integrate sharing, and you'll find in the sample app a helpful example custom UIActivity and share icon you can use in your own applications - it's ShareActivity in the GooglePlusSample, and ShareSheetMask in the resources folder.

There have also recently been several improvements to the Google+ iOS app (and the mobile browser sign-in interface) which may benefit your users. For example, you may notice that when signing in you now have the option to choose an account or register a new one, to make life easier for users with multiple accounts. If you want to offer a switch in your own app, the easiest way is to call [[GPPSignIn sharedInstance] signOut] to clear the local state, then [[GPPSignIn sharedInstance] authenticate] to sign in again.

Special thanks from me to +Banjo, +musiXmatch and +allthecooks for trying out the SDK and sharing useful feedback pre-release: go check out their apps (all of which will be featuring the new sharebox soon!)

Read More

Google+ client changes in Google Play Services 3.2

The latest update to the Google Play Services client library, on the 20th of September, had a couple of Google+ changes in it that the docs were a bit slow to reflect. Though they're now all up to date, I wanted to quickly highlight the changes in a post as well, for anyone looking at the client library.

PlusOneButton no longer requires PlusClient

No longer do you have to worry about calling clearScopes() on your PlusClient before creating a +1 button in an Android app. The PlusClient argument has been removed from the initialize methods on the PlusOneButton class, which means creating a +1 button is now as easy as adding it to your layout file:
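As a sketch, the layout entry might look like this - the attribute names and namespace follow the PlusOneButton docs, so double-check them against your Play Services version:

```xml
<!-- A +1 button in the layout; no PlusClient wiring required. -->
<com.google.android.gms.plus.PlusOneButton
    xmlns:plus="http://schemas.android.com/apk/lib/com.google.android.gms.plus"
    android:id="@+id/plus_one_button"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    plus:size="standard" />
```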

We set up the button by calling the initialize function with a URL, and either a request code or a click listener.

loadPerson is dead, long live getCurrentPerson

The people loading functions were simplified as well. The PlusClient.OnPersonLoadedListener interface and the PlusClient.loadPerson method have both been removed. For loading people, the PlusClient.OnPeopleLoadedListener interface and the PlusClient.loadPeople methods are still there, and allow you to retrieve one or more people by user ID, as described in the loadPeople docs. However, for the common case of getting details about the signed-in user, the best idea is to use getCurrentPerson. This gets populated as part of the sign-in process, so should be the quickest and simplest way of getting the user's profile:

Thanks to +Gerwin Sturm and +Stephan Linzner for the pings about the documentation, and to +Brett Johnson for getting it updated super quickly!

Read More

Device Sign-In With Google

At my talk at Over The Air 2013 this weekend, in the wonderful Bletchley Park, one thing that surprised some people was the fact that Google has an OAuth 2.0 option for low-capability devices. This is one of the big benefits of using Google as an IdP - it allows you to take advantage of all the work that the identity and security teams do in areas like two-factor auth, data management, and access for all sorts of different environments.

The device flow is really for cases where you want to allow a user to sign in to something that doesn't have a great input setup - for example a TV or a wifi-enabled toaster. With it, the user only needs to indicate that they want to sign in, and they actually authenticate in a web browser on a regular PC or their mobile device.

You can take a look at how it works below. This iframe is representing a device that you might want to sign in to, and as soon as you click the Sign In button below, it'll try to sign in and give you a code. You'll need to enter the code into the device log in page at google.com/device, and approve the permissions. Then, the iframe should sign-in automatically.

This is the flow we're going through:

On the device, we first request a device sign-in with the scopes we want to access and a client ID to identify the project. Note that the client ID must have been created in the API console as one for devices. You can't reuse a client ID for a site or a mobile app, though they can of course be in the same project.

Making the call looks like this (in PHP, the natural language of wifi enabled toasters):

In the response we get a code for the device, a code for the user, and an interval (in seconds) that we can poll with to see whether the user has granted us permission. As you can see, the device code looks much like the code we'd get for a regular OAuth 2.0 flow, while the user code is a bit simpler!


{
  "device_code" : "4/pH76g9gRK_r0_M8nFBp8ru0DikyU",
  "user_code" : "tgm224xy",
  "verification_url" : "http://www.google.com/device",
  "expires_in" : 1800,
  "interval" : 5
}

At this point we need to display the user_code to the user, and we can set up a loop to check for the token. The check is against the regular OAuth 2.0 token endpoint - the same one we'd use for a code or refresh-token based call to get an access token. In the little app above there is a JavaScript loop to do the polling, but the check itself is made from the server side, because it needs the client secret, which naturally has to be kept on the server only. The Sign In button in the example is there to avoid spamming the auth servers with this check, as there really is no user interaction needed on the device.
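Each poll is just a form POST. Here's a plain-Java sketch that builds the body - the grant_type string is the one Google documented for the device flow at the time, so treat it as an assumption and check the current docs:

```java
import java.net.URLEncoder;

public class DevicePoll {
    // Build the POST body for polling the token endpoint with a device
    // code. The grant_type value here is an assumption - verify it
    // against the current device-flow documentation.
    static String pollBody(String clientId, String secret, String deviceCode)
            throws Exception {
        String grant = "http://oauth.net/grant_type/device/1.0";
        return "client_id=" + URLEncoder.encode(clientId, "UTF-8")
             + "&client_secret=" + URLEncoder.encode(secret, "UTF-8")
             + "&code=" + URLEncoder.encode(deviceCode, "UTF-8")
             + "&grant_type=" + URLEncoder.encode(grant, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // The device_code from the response above, plus our credentials.
        System.out.println(pollBody("my-client-id", "my-secret",
                "4/pH76g9gRK_r0_M8nFBp8ru0DikyU"));
    }
}
```

You'd POST this body to the token endpoint every `interval` seconds until the user has approved (or the codes expire).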

This isn't very different from a regular code exchange, other than the different grant_type. In the demo we just parse the response, extract the token, and retrieve the user's public Google+ profile to get their name. The response includes an access token and a refresh token, so we wouldn't have to ask the user to sign in again after an hour - though in the example app we just throw the credentials away.

At this point, we can call APIs and do all of the stuff that we need in order to enable printing your recent Google+ posts onto toast, or some other eminently Kickstarterable reason. Mainly though, we have an identity which can be shared with a signed-in user in a web or mobile app, meaning it's easier to create a good experience for a user from app to device and vice versa.

Read More

Using a gamepad in Android

I've recently had the pleasure of messing about with some of the codelabs created by the Google Play Games team to demonstrate their functionality. At the same time, I also got my hands on a rather fun Moga Pro game controller, which can connect to Android phones and tablets via Bluetooth. Getting the two to work together was actually very easy, so I just wanted to note down a quick example for anyone trying to do something similar!

When it comes to the gamepad input itself, there are a couple of options. Moga, along with other controller suppliers, have their own SDK to download and integrate. However, to get something going you don't really need that at all - the standard human interface device functionality in Android SDK level 12 and above includes support for gamepads and game controllers.

From this point of view a gamepad is just a series of keys and joysticks. By default, once a pad is paired with an Android device, interacting with it causes events to be sent to the application and view that have focus - so you can generally navigate around, press buttons and so forth. This focus is worth noting: if we want to receive events in a custom view (as used in a game), it will need focus. To do that we can add a <requestFocus /> tag in the layout to the view, and set setFocusable(true); in the View constructor.

To actually receive the events we just need to implement the appropriate functions. For the joystick it is onGenericMotionEvent.

When the joystick is moved this function is called with a MotionEvent, as with onTouch events. In the function we filter for joystick events and extract the positions on the X and Y axes. The value is between -1 and 1 in each direction, with the rest point at 0. This happens to fit pretty well with what we needed for my little dot view - the divide by 4 is just to limit the bounds the dot moves to.
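The arithmetic from that snippet, pulled out as plain Java - the divide-by-4 bound is just the value used in the dot view:

```java
public class AxisScale {
    // Map a joystick axis value in [-1, 1] to an offset from the centre
    // of the view, bounded to a quarter of the view size each way.
    static float offset(float axisValue, float viewSize) {
        return (axisValue * viewSize) / 4f;
    }

    public static void main(String[] args) {
        System.out.println(offset(1.0f, 400f));  // full right deflection
        System.out.println(offset(-0.5f, 400f)); // half left deflection
    }
}
```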

Similarly to this, each button on the device will fire a KeyEvent to the onKeyDown function:

Here we return true if we consume the event, or pass it on to the super class for processing if not, so the pad could be used to interact with other views. In general though we'll want to suppress the events we're actually using so there aren't any unexpected interactions.

Each button is registered with a different keycode, so we can look for A and B for this little view. This is where it can get tricky with different gamepads though - not every device will necessarily report the same keycode mappings for similarly placed buttons, so it is worth testing, and perhaps supplying different "maps" for popular controllers when building a game.
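A sketch of such per-controller maps in plain Java - 96 and 97 are the stock KeyEvent.KEYCODE_BUTTON_A and KEYCODE_BUTTON_B values, and the "swapped" map is a hypothetical example of covering an oddly-mapped pad:

```java
import java.util.HashMap;
import java.util.Map;

public class ButtonMaps {
    // Translate raw keycodes into game actions via a per-controller table.
    static Map<Integer, String> stockMap() {
        Map<Integer, String> m = new HashMap<>();
        m.put(96, "jump"); // KEYCODE_BUTTON_A
        m.put(97, "fire"); // KEYCODE_BUTTON_B
        return m;
    }

    // Hypothetical table for a pad that reports swapped codes.
    static Map<Integer, String> swappedPadMap() {
        Map<Integer, String> m = new HashMap<>();
        m.put(97, "jump");
        m.put(96, "fire");
        return m;
    }

    public static void main(String[] args) {
        System.out.println(stockMap().get(96));      // jump
        System.out.println(swappedPadMap().get(96)); // fire
    }
}
```

In onKeyDown you'd look the keycode up in whichever map matches the connected controller, falling back to the stock map.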

Overall, very straightforward! There's a gist of the complete view these snippets were taken from as well!

Read More

Defining Constants in Objective-C

One of the nice things about working in developer relations is I get exposed to an awful lot of code written by an awful lot of different people, which means I regularly encounter techniques I haven't seen before. I wanted to quickly post on one that is possibly very familiar to iOS developers, but that I only saw recently - though I have to say that for readability's sake I would recommend some good commenting if you do decide to use it!

The code came from an iOS codelab for the excellent Google Play Games service, written by the mighty Kerp. The code was a header file doing something pretty normal: defining a list of constants for use in the game. What was a little unusual was that it had the values both declared and defined in the header, and that there seemed to be some business with macros occurring.

This took me a little bit of parsing! In general the normal way of defining these sorts of constants in Objective-C is to have an extern'd definition in the header file, and the value in the implementation (.m) file, so:

This is assuming that you need the constant available throughout your app - if it's just needed in one file you can declare it static in the .m:

It's also possible to use the preprocessor to #define a constant, but in general it's nicer to have a real value. This is particularly true for strings, as it means all references to the constant will point to the exact same bit of memory, as opposed to several bits of memory that happen to have the same contents.

So, back to our mystery macros. The problem they are solving is having both the declaration and the value in the same place. The .h file with the constants in it is imported into all the .m files that need it. The #import directive in Objective-C works very much like the regular C #include, in that it is replaced by the contents of the file it references. However, #import will only do this once per compilation unit - effectively per .m - so if you #import files a and b, and b imports a again, you won't run into trouble.

That said, each compilation unit does the inclusion separately, and here is where the macros kick in. If I include my constants file in, say, a ViewController, then the macros at the top are processed:

  • We check to see if APP_DEFINE_CONSTANTS is defined (it isn't)
  • We set EXTERN to "extern"
  • We set INITIALIZE_AS to nothing
  • We include the line in our file: extern const double kAppFirstConstant

Exactly as we were using in the regular .m and .h separation! Now, as you may have guessed we can look at the Constants.m file, which just contains:

When we compile that unit, the same code will get inlined, but this time:

  • We check to see if APP_DEFINE_CONSTANTS is defined (it is)
  • We set EXTERN to nothing
  • We set INITIALIZE_AS to "= x"
  • We include the line in our file: const double kAppFirstConstant = 0.025

Exactly what we'd want in our .m file! So we have both declarations and values in one file. As an aside, if you're wondering why the constants start with k, it's because the Google C++ style guide suggests it, and that is referenced from the Google Objective-C style guide. It presumably found its way into the guides as a vestigial bit of Hungarian Notation.

Read More

QUIC Notes: Stream multiplexing and congestion control

In a previous post I talked about some of the ideas that drove QUIC, but in this one I want to go a little more into how it actually works to get data from host to host. Again, this is all from the notes I made looking through the source and spec - so always go to the code for the latest!

All networking is constructed as a series of layers - each layer defining its own mechanisms for reliability, name resolution and other fundamentals. Though QUIC is a single protocol, the layering is still present within it - an individual piece of app data (say an HTML page) will be encapsulated in several different layers of framing: the stream frame, the QUIC session, and a cryptographic channel.

This means any given QUIC packet can be peeled back layer by layer. For example, a regular packet containing part of a web connection might look like: [[CRYPTO HEADER] [ENCRYPTED DATA: [QUIC HEADER [STREAM FRAME HEADER [THE STUFF YOU ACTUALLY WANT]]]]]. While this looks like a lot of overhead, there are a few tricks used to minimise it, so in reality the headers will often be in the range of 20-30 bytes.

One of the fundamental decisions in QUIC is to multiplex streams over a single logical connection - much as SPDY multiplexes streams carrying web data across a single TCP connection. However, QUIC uses the connectionless UDP, so each packet is sent independently, without reference to an established connection. Therefore, the multiplexing needs to happen a little differently. QUIC creates what can be thought of as a cryptographically secured connection, or session, between a client and a server. Connection information and data packets are then sent within this logical pipe.

Every QUIC packet has some standard data included with it. An individual connection is identified by a GUID, a 64-bit pseudorandom number. This is important, because the vagaries of UDP over the internet mean that the same connection between two parties may not consistently be on the same 4-tuple of (source IP, source port, destination IP, destination port), which is how connections are normally identified. Each packet also contains a sequence number, which increments as each packet is sent (from either side), along with flags for whether the contents are FEC packets or data packets, and whether the connection is being reset. This forms the common QUIC packet header.


[Image: in case you wanted to see what my notes looked like]

Both the GUID and sequence number can be sent in a reduced form - so if 1 byte is sufficient to identify the connection, or the number of the packet, we can set flags to indicate that, and save ourselves a bit of header overhead. There is a rough maximum packet size, aimed at avoiding fragmentation across multiple packets of the underlying communication layers (IP, Ethernet, etc.). This is around 1500 bytes in most cases (though it may vary when jumbo Ethernet frames are present, or with alternative lower-layer protocols along the path).
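As a rough sketch of the common header fields and the reduced-form idea (field names are my own, not the Chromium source's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The common QUIC packet header fields described above (illustrative). */
typedef struct {
    uint64_t connection_guid; /* 64-bit pseudorandom connection identifier */
    uint64_t sequence_number; /* increments with every packet sent */
    bool is_fec;              /* FEC packet rather than data */
    bool is_reset;            /* connection is being reset */
} QuicPacketHeader;

/* The GUID and sequence number can be sent in reduced form: pick the
   smallest byte width that still holds the value. */
static int reduced_width(uint64_t value) {
    int bytes = 1;
    while (value >>= 8)
        bytes++;
    return bytes;
}
```

A small GUID or a low sequence number can then be flagged as 1 byte on the wire rather than a full 8, which is where the header-overhead saving comes from.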

Streams

Once the connection is established, either the client or the server can initiate a stream within it at any time. In most cases, one stream will be created at connection initialisation time by the requesting party (the client), to support the aims of 0-RTT where possible. In keeping with that goal, a stream is initiated simply by sending data with a 32-bit stream ID which has not been used within the connection before.

To avoid collisions when creating new streams, streams initialised by the client start with an odd number and ones by the server with an even number. In either case, the numbers must always increase - it's fine for the server to initiate stream 6 followed by stream 10, but not for it then to go back and initiate stream 8.
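The odd/even split and the must-increase rule can be sketched as a small validity check (a hypothetical helper, not QUIC's actual API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Track the highest stream ID initiated by each side. */
typedef struct {
    uint32_t last_client_stream; /* highest odd ID seen (0 = none yet) */
    uint32_t last_server_stream; /* highest even ID seen (0 = none yet) */
} StreamIdTracker;

/* A new stream ID is valid if it has the right parity for its initiator
   and is higher than any ID that side has used before. */
static bool valid_new_stream(StreamIdTracker *t, uint32_t id, bool from_client) {
    if (from_client) {
        if (id % 2 == 0 || id <= t->last_client_stream) return false;
        t->last_client_stream = id;
    } else {
        if (id % 2 != 0 || id <= t->last_server_stream) return false;
        t->last_server_stream = id;
    }
    return true;
}
```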

There are quite a few frame types used for different parts of the flow, but for streams we just need the type byte to be set to 1 for STREAM_FRAME. Once the type is established as a STREAM_FRAME, that defines the structure of the payload: a 32-bit stream ID, a byte currently used for sending a FIN flag to close the stream, a 64-bit offset representing the byte position in the underlying data being sent, then 16 bits describing the length of the remaining data in bytes. Generally, a given QUIC packet will contain data from only one stream, though as an optimisation small frames from multiple streams may be combined - in that case each will have its own stream header.
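The layout just described can be sketched as a struct (the logical fields, not the exact wire encoding):

```c
#include <assert.h>
#include <stdint.h>

/* Logical fields of a STREAM_FRAME as described above. On the wire the
   fields are packed, so the header costs 1 + 4 + 1 + 8 + 2 = 16 bytes. */
typedef struct {
    uint8_t  type;      /* 1 = STREAM_FRAME */
    uint32_t stream_id; /* which stream this data belongs to */
    uint8_t  flags;     /* currently just carries the FIN bit */
    uint64_t offset;    /* byte position within the stream's data */
    uint16_t length;    /* length of the remaining data, in bytes */
} StreamFrameHeader;

enum { STREAM_FRAME_WIRE_BYTES = 1 + 4 + 1 + 8 + 2 };
```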

Proof of Ownership

One of the interesting ideas built into the QUIC protocol is the idea of proof. There are two areas where this applies: proof of ownership of an address, and proof of data received. Both of these are addressing specific threats that could be used for denial of service type attacks.

Proof of ownership is a way for a QUIC server to validate that the client it is communicating with is really on the IP address that it appears to be - it's easy to spoof a message from a certain IP, but harder to receive data sent to it without having a man-in-the-middle. To do this the server generates a hard-to-guess proof value, and sends it as part of the crypto packet framing along with some data. Receiving a valid value back from the IP indicates that the sender really is listening on that address. This is effectively how an unpredictable sequence number in TCP works, so it is a familiar protection.

The other area of proof is in terms of received data. Some attacks involve tricking a server into resending data it has already sent, which is often significantly larger than the request to resend - so the server can be used to amplify an attack by causing a lot of extra traffic to be sent to a given host. So servers (though this works both ways, it's mostly a concern for servers) require proof that the user has received the data that has been sent thus far. To do this a specific bit is flipped randomly in the sequence number, and a flag is set in the QUIC packet header to indicate so. The receiver tracks a cumulative hash of the entropy seen so far, and sends that with ACK messages to prove it has really received the previously sent data.
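A sketch of the cumulative entropy idea - the accumulation function here (XOR of the entropy bit shifted by sequence number) is an illustrative stand-in, so check the source for the real one:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fold one packet's entropy bit into the running hash. The receiver does
   this for every packet it sees and echoes the result in its ACKs; a
   party that never received the packets can't guess the accumulated value. */
static uint8_t fold_entropy(uint8_t running, uint64_t seq, bool entropy_bit) {
    if (entropy_bit)
        running ^= (uint8_t)(1u << (seq % 8));
    return running;
}
```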

Reliability

This leads on naturally to reliability. The first line of defence against packet loss is the Forward Error Correction mentioned in the last post, but QUIC also includes frame types for dealing with resending data, referred to as ACK frames (though they often represent a negative acknowledgement). The ACK frame in QUIC cleverly contains a lot of status information, allowing the other side of the connection to have a clear understanding of the state. Unlike in TCP, ACKs don't need to be sent all the time, but only as needed based on the congestion control rules.

As with all frames, it starts with a type byte, which in this case is set to 2 for ACK frame. Then the remainder of the frame is split into two sections:

The first block shares information about data sent by this participant. This is a bit unusual! It means that in a client-server situation, when the client acknowledges information it has received, it simultaneously communicates information about what it has sent to the server. This means that any ACK in either direction effectively updates the connection status for both hosts. This is clever, and gives things like congestion control algorithms a lot more data to work with.

The block is structured as a byte for the hash of the entropy bits sent so far, to prove what has been sent. Next is the sequence number of the first frame which has been sent, but has not yet been acknowledged.

The second block is structured the same way, but for data that has been received. The first byte is the entropy of frames received thus far, followed by the sequence number of the highest contiguous packet received - so if 3, 4, 5, 7, 8 have been received, it would be 5. However, if there are fewer than 256 missing packets, the sequence number can instead be the highest received: in the previous case, 8. Then the next byte is the number of missing packets, followed by their individual sequence numbers. So, in this case the data would be [1] packet missing, [6] sequence number of the missing packet.
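The received-data block for that example can be made concrete with a hypothetical helper:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the receiver reports: the highest sequence number observed plus an
   explicit list of the packets still missing (fewer than 256 supported). */
typedef struct {
    uint64_t largest_observed;
    uint8_t  num_missing;
    uint64_t missing[255];
} AckReceivedInfo;

/* Build the report from a sorted list of received sequence numbers. */
static AckReceivedInfo build_received_info(const uint64_t *seqs, size_t n) {
    AckReceivedInfo info = {0};
    info.largest_observed = seqs[n - 1];
    for (size_t i = 1; i < n; i++)                        /* walk the gaps */
        for (uint64_t gap = seqs[i - 1] + 1; gap < seqs[i]; gap++)
            info.missing[info.num_missing++] = gap;
    return info;
}
```

For received packets 3, 4, 5, 7, 8 this yields largest observed 8 with the single missing packet 6.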

This is a very cool optimisation - it means the ACK can function akin to a TCP acknowledgement, but can also operate as a NACK to ask for retransmission of specific packets that have been lost. This is particularly convenient for multiplexing: if a packet is lost, valid streams can be ACK'd and delivered while still requesting retransmission of the lost data.

Congestion Control

With the information in the ACK packets, each side of the connection has the ability to detect potential congestion on the path between them. In general, packet loss is the best sign of congestion, as the two most common causes of packets being dropped (in a UDP scenario) are down to more packets arriving at a point than can be processed. If the receiving host cannot pull packets off the UDP buffers fast enough, the kernel will likely just throw incoming data away. On the network, if a link is too busy, the network hardware may drop packets. This can even happen if the network looks like it is about to become busy, or if systems like RED (Random Early Detection) are enabled.

Unlike TCP, UDP has no native way of dealing with congestion. If a link between a client and server is congested, a TCP sender will back off and reduce the load until the connection stabilises. UDP has no way of detecting the failure though, so always sends at the full rate. This means that without other consideration, QUIC would overwhelm other protocols across a crowded link as the others backed off. It would take up an increasing amount of the available bandwidth, making it somewhat unpopular with network admins. Similarly, an overwhelmed receiver that was running out of memory would have no way to signal to the sender to reduce the flow.

QUIC offers two methods of congestion control, both designed to operate well on a primarily TCP network. These are signalled by another type of data frame called the Congestion Feedback Frame. Which type of congestion control is used is negotiated as part of connection establishment (it's actually a tag which is communicated during the crypto setup).

The most straightforward type is a port of the TCP CUBIC congestion mechanism, as used in Linux. In this case the congestion frame has a type byte (3 for congestion feedback), followed by 2 bytes for the count of lost packets, and then 4 bytes for the size of the receive window. This receive window determines how much data the other side can have "in flight" - meaning sent, but unacked. This data is used in the congestion control algorithm just like in standard TCP backoff - if three lost packets are seen, the window size is halved, so the sender backs off. If the connection is going well, the window size can be increased, speeding up the rate of send. In QUIC's case, it can actually communicate these changes to the other side with this frame type, while in TCP it's generally implied.
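The backoff logic amounts to something like this (the thresholds and linear growth are illustrative simplifications, not the tuned CUBIC parameters):

```c
#include <assert.h>
#include <stdint.h>

/* TCP-style backoff on the receive window: halve it when enough loss is
   seen, otherwise grow it. Real CUBIC grows the window along a cubic
   curve; the linear growth here is a simplification. */
static uint32_t update_window(uint32_t window, uint16_t lost_packets) {
    if (lost_packets >= 3)
        return window / 2;   /* loss: back off */
    return window + 1460;    /* all well: allow roughly one more packet */
}
```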

The alternative system QUIC supports is a pacing-based congestion control. This is quite a different approach - rather than responding to dropped packets by reducing the send rate after the fact, the system attempts to estimate the bandwidth available and then measure out packets so as not to overwhelm the available resources. In a well-functioning connection there should be a pretty steady back and forth of data and acknowledgements - if there is a delay in acknowledging, then it is taking longer for that data to get through, which indicates extra buffering. Alternatively, if increasing the send rate doesn't result in a slower round trip, then there is extra available capacity, and sending can go faster.

QUIC can track the round-trip time via ACK frames, but because each packet won't necessarily get an individual ACK, it has a couple of other techniques. Firstly, it can track the time between different packets arriving. It uses this to estimate the available bandwidth (initially just by going with number of bytes sent / time). This is used to make a decision whether to increase the send rate, decrease it, or hold steady. It also provides an estimate of what state the connection is in - such as whether there is another flow it is competing with (which would result in more rapid changes to the congestion conditions).
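A sketch of that decision loop (the 25%/5% thresholds are invented for illustration; the real estimator in the source is considerably more involved):

```c
#include <assert.h>
#include <stdint.h>

typedef enum { RATE_DECREASE, RATE_HOLD, RATE_INCREASE } RateDecision;

/* Naive initial bandwidth estimate: bytes seen over elapsed time. */
static double estimate_bandwidth(uint64_t bytes, double seconds) {
    return seconds > 0.0 ? (double)bytes / seconds : 0.0;
}

/* Compare round-trip samples: a growing RTT suggests buffering (slow
   down); a steady RTT under increased load suggests spare capacity. */
static RateDecision pace(double prev_rtt, double current_rtt) {
    if (current_rtt > prev_rtt * 1.25) return RATE_DECREASE;
    if (current_rtt < prev_rtt * 1.05) return RATE_INCREASE;
    return RATE_HOLD;
}
```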

Secondly, information is communicated explicitly in the congestion feedback frame, just as with the CUBIC implementation. In the pacing case the frame looks a little different: it has the type byte and a count of lost packets as before, but followed by a series of sequence number and arrival time pairs for the last few received frames. This allows the other side to calculate a number of explicit round-trip times at once, and update its bandwidth estimate more accurately.
