Reducing data consumption for the end customer is a paramount task when building a mobile app, and it was one of our biggest challenges with Current Caller ID. Since Current feeds information from social networks, news and weather, it naturally delivers a lot of data to clients. Our solution was twofold. First, we used Apache Thrift for our API to shrink payloads, which helped even after gzipping. Second, we made sure the server could identify new items and deliver them incrementally.
Here’s how it all came together…
An intro to Current Caller ID
WhitePages has provided standard caller ID on the Android platform for several years, but caller ID is only useful for calls you receive from folks you don’t know. Looking at our data, we saw that only about 40% of the calls a person made or received involved unknown callers. We saw a huge opportunity to improve the value we deliver to our customers by including more information. Our goal was to inform you about who you communicate with, and how you communicate with them.
Broadly, we grouped the features into:
- Social network activity from your contacts
- Location-specific news and alerts based on your contacts’ location
- Statistics about your communication patterns with your contacts
Feeding social updates, news and weather into our app required us to deliver lots of data to our clients, and we were concerned with how delivering this extra data across the wire would impact customers’ data plans.
Apache Thrift to the Rescue
Most of our existing client APIs were semi-RESTful with JSON payloads. When retrieving or sending 100 contact deltas at a time, the numbers were poor: the average JSON payload was around 100-120k. Gzipped, these payloads were around 12-15k, which is good, but we thought we could do better. That led us to look at other avenues for APIs.
Some teams at WhitePages had been exploring Thrift as a replacement for a legacy RPC mechanism we had for internal services. Its abstract service and structure definition, client generation for multiple languages and binary payload made it a natural choice for exploration in mobile client interfaces, too.
One concern at the time was that the only other company we knew of that exposed Thrift APIs publicly was Evernote. But we decided to trudge on anyway, and found that over the wire (for our data types), Thrift with the standard binary serializer gave us savings of at least 50% on average uncompressed (40-50k payloads for 100 contacts, versus 100-120k). Gzipped, the savings were less significant, but on average we still saw ~20-30% savings (10-12k payloads versus 12-15k for JSON). Also, the CPU cost of processing Thrift binary payloads was lower than for JSON payloads in many cases on Android devices, which reduces battery consumption.
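Much of the uncompressed gap comes from JSON repeating every field name as text in every record, while a binary encoding writes typed values without the names. The sketch below illustrates that effect using Python's `struct` module as a stand-in; it is not Thrift's actual wire format, and the contact fields and sample values are hypothetical.

```python
import gzip
import json
import struct


def make_contacts(count):
    """Build hypothetical contact deltas; field names are illustrative only."""
    return [
        {
            "contact_id": i,
            "updated_at": 1_350_000_000 + i,
            "name": f"Contact {i}",
            "phone": f"206555{i:04d}",
        }
        for i in range(count)
    ]


def encode_json(contacts):
    """Text encoding: every record carries its field names."""
    return json.dumps(contacts).encode("utf-8")


def encode_binary(contacts):
    """Length-prefixed binary layout with no field names on the wire."""
    out = bytearray(struct.pack(">I", len(contacts)))  # record count
    for c in contacts:
        out += struct.pack(">qq", c["contact_id"], c["updated_at"])
        for text in (c["name"], c["phone"]):
            raw = text.encode("utf-8")
            out += struct.pack(">I", len(raw)) + raw  # string length, then bytes
    return bytes(out)


def payload_sizes(contacts):
    """Return raw and gzipped byte counts for both encodings."""
    as_json = encode_json(contacts)
    as_binary = encode_binary(contacts)
    return {
        "json": len(as_json),
        "json_gzip": len(gzip.compress(as_json)),
        "binary": len(as_binary),
        "binary_gzip": len(gzip.compress(as_binary)),
    }
```

Calling `payload_sizes(make_contacts(100))` gives a quick way to compare the four numbers for a batch of 100. Gzip narrows the gap because it already compresses the repeated field names well, which is consistent with the smaller gzipped savings we measured.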
Thrift also gave us more freedom to iterate on the service and structures during development with a reduced impact on client engineers, since the clients and the structures could be re-generated, and only the encapsulating models expressing those structures needed to be changed. Compared to the amount of time we’d spent in the past building client SDKs for our REST APIs, this process saved us several weeks of engineering effort for these new APIs.
Current’s interface definitions have been used to generate services and models in Ruby (for our services and the clients used for unit/integration testing of those services), Java (for our Android application in the market), and Objective-C (for some iOS prototypes we’ve built).
There are a few downsides to our choice of Thrift and Ruby. The first is that Thrift doesn’t support inheritance directly, so to extend existing structures you have to build composites, which in some cases creates a deep hierarchy. The second is that, because of how the Ruby runtime allocates objects, this compositing tends to allocate a lot of objects, and at scale Ruby systems whose primary job is to fetch and transmit Thrift can spend a lot of CPU time in GC.
Only Delivering When Important Data has Changed
To keep data usage down, we also wanted the client to receive only records with important changes, the kind that spark conversation. In our world, examples of an important change would be:
- Address and phone changes. “Congrats on your new home!”
- Social profile updates. “Your job title changed, congrats on the promotion!”
- New status and check-in data. “I was just at Disneyland too! The Matterhorn stands the test of time.”
- News and weather changes about the contact’s locale. “Wow, Lindsay Lohan, why are you calling me, and in the news too? At least it’s still sunny in California.”
In all cases, the important changes above should only matter for people we can actually resolve against people you communicate with. Because of this, we don’t fetch and deliver statuses and check-ins for people whom we haven’t tied to a contact, nor do we deliver news and weather for locales that you don’t have contacts associated with.
Statuses and check-ins are fairly transient and time-sensitive, so we schedule updates for this information on the server more frequently than profile changes. News and weather are time-sensitive but don’t change frequently during the day, so we fetch them twice a day. Profile data changes happen less often, so depending on the social provider we fetch that anywhere from twice a day to every 5 days. Even though we fetch at these intervals, we only update the client when something has actually changed.
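A schedule like this can be expressed as a small table of refresh intervals per data type. The following is a minimal sketch rather than our production scheduler: the key names are hypothetical, and the one-hour status interval is an assumed placeholder for "more frequently", since the exact figure isn't stated here.

```python
from datetime import datetime, timedelta

# Refresh interval per data type. The twice-a-day and 5-day values follow the
# text above; the status interval is an assumption for illustration.
REFRESH_INTERVALS = {
    "status": timedelta(hours=1),          # assumed value
    "news_weather": timedelta(hours=12),   # twice a day
    "profile_fast": timedelta(hours=12),   # providers refreshed twice a day
    "profile_slow": timedelta(days=5),     # providers refreshed every 5 days
}


def is_due(kind, last_fetched, now):
    """Return True when data of type `kind` should be fetched again."""
    return now - last_fetched >= REFRESH_INTERVALS[kind]
```

For example, `is_due("news_weather", last_fetched, datetime.now())` becomes true once twelve hours have passed since the last fetch. Whether a fetch results in anything being sent to the client is a separate decision based on whether the data changed.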
Also, to reduce the cost of spinning up network connections, all of our APIs accept and return batches of data.
To help identify true changes, we rely on the Ruby implementation of Thrift, which includes a hash function that is a true checksum of the data elements of the entire structure. Some of the fields in our structure are not relevant to the change (like the last updated time), so we had to make some modifications for our dupe check code to ignore fields that we wanted to exclude, but it saved us the effort of having to implement our own mechanism from scratch.
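The idea of a content checksum that skips irrelevant fields can be sketched as follows. This is not the hash function from the Ruby Thrift library; it is a Python stand-in built on `hashlib`, and the `last_updated` field name is a hypothetical example of a field to exclude.

```python
import hashlib
import json

# Fields that should not count as a "true" change (hypothetical name).
IGNORED_FIELDS = frozenset({"last_updated"})


def content_fingerprint(record, ignored=IGNORED_FIELDS):
    """Hash a record's top-level fields, skipping the ignored ones.

    Keys are sorted so that two records with equal content always
    serialize, and therefore hash, identically.
    """
    relevant = {k: v for k, v in record.items() if k not in ignored}
    canonical = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two records that differ only in `last_updated` produce the same fingerprint, while a changed job title produces a different one. Note that this sketch only filters top-level keys; nested structures would need the same treatment applied recursively.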
When data comes in from the server, we compare it to existing data and only update items that have true differences. Each of these updated items then gets an updated timestamp, which we store along with the contact.
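A minimal sketch of this compare-then-stamp step, assuming an in-memory dictionary as the store and a hypothetical `last_updated` field that is excluded from the comparison:

```python
def apply_updates(store, incoming, now, ignored=("last_updated",)):
    """Merge `incoming` records into `store`; return the ids that truly changed.

    `store` maps contact id -> {"data": record, "updated_at": timestamp}.
    Records whose relevant fields are unchanged keep their old timestamp.
    """
    def relevant(record):
        return {k: v for k, v in record.items() if k not in ignored}

    changed = []
    for contact_id, record in incoming.items():
        existing = store.get(contact_id)
        if existing is not None and relevant(existing["data"]) == relevant(record):
            continue  # no true difference: leave data and timestamp alone
        store[contact_id] = {"data": record, "updated_at": now}
        changed.append(contact_id)
    return changed
```

Because unchanged records keep their old timestamp, they never look new to a client that filters by timestamp.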
Clients are aware of the latest timestamp of data that they’ve got locally, and submit that to us as part of their request for updates, which we use to filter out data older than that timestamp before returning. The client then records the latest timestamp for the next periodic request.
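The exchange can be sketched as a pair of functions, one per side. The function names and record layout are hypothetical, and the strict greater-than comparison is an assumption: it treats a record stamped exactly at the client's timestamp as already delivered.

```python
def updates_since(store, since):
    """Server side: return records stamped after the client's timestamp."""
    return [
        {"contact_id": contact_id, **entry}
        for contact_id, entry in store.items()
        if entry["updated_at"] > since  # assumed: equal timestamps were already sent
    ]


def sync(store, last_seen):
    """Client side: fetch newer records and compute the next timestamp to send."""
    fresh = updates_since(store, last_seen)
    # If nothing is new, keep sending the same timestamp next time.
    newest = max((r["updated_at"] for r in fresh), default=last_seen)
    return fresh, newest
```

A client that calls `sync` again with the returned timestamp gets an empty list until the server stamps something newer, so an idle period costs almost nothing in payload.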
The broad features of Current required us to design several new systems, and ensure those services scaled to support a much larger customer base. Nine months after launch, we’ve proven that our system can scale up effectively.