

Interpreting Apple's Transparency Report
Andrew McKnight – 23 January 2019

Earlier this month Apple released their transparency report for January 1 to June 30 of 2018, detailing requests for various types of information from governments and private third parties. They publish the information in a nice web app presentation, PDFs and CSVs… perfect for some number crunching.

Update: removed a comment about Germany being an outlier in the 4th chart, Requested Device Amount Distribution; I wrote that comment about a different chart and swapped the charts without removing the comment! Also updated FastMath to 2.2.1, since I released a few fixes while publishing this post, and made a few wording improvements.

Crunching the Numbers

I focused on requests targeting devices, financial identifiers and account info. There are other types of requests reported, but the formats of the CSVs are different, so I stuck to these for simplicity’s sake. They’re also the main ones used on Apple’s website for detailed country reports.

For each request type, I’m interested in the number of requests received from each country, the number of items requested and the number of requests honored by Apple. Apple reports those last two as ratios and percentages, respectively. I converted the requested item ratios to absolute units to show how many devices or accounts are being sought. While I use the percentage of honored requests as a proxy for request quality, I also convert those percentages to absolute units for side-by-side comparison in each chart showing request counts.
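The conversion described above is simple arithmetic. Here is a minimal Python sketch of it, not the Swift code actually used for this post; the function name, parameter names and input figures are all made up for illustration.

```python
def to_absolute(requests, items_per_request, percent_honored):
    """Convert a reported ratio and percentage into absolute counts.

    All parameter names are hypothetical; they are not Apple's CSV headers.
    """
    # Items sought = number of requests x average items per request.
    items = round(requests * items_per_request)
    # Honored requests = number of requests x honored percentage.
    honored = round(requests * percent_honored / 100)
    return items, honored


# Made-up figures for illustration only, not taken from Apple's report.
items, honored = to_absolute(requests=200, items_per_request=2.5, percent_honored=80)
print(items, honored)  # 500 160
```

Rounding is needed because a rounded ratio or percentage rarely multiplies back to an exact whole number.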

I’m most interested in the number of items requested, as that is the true impact on Apple’s users–and the world’s citizens. I see the number of requests–and hence the ratio of items per request–as the efficiency of the requesting country. Finally, the percentage of fulfilled requests I use as a proxy for the quality of the requests submitted.

Device Requests

Here are the top 10 countries, by number of devices across all requests:


For the same 10 countries, here’s the breakdown of the amount of requests made, along with the amount of requests that Apple honored:

Poland and South Korea submitted a very small amount of device requests, each with a large batch of target devices. Very efficient!

The number of requests includes updates or resubmissions of earlier requests either rejected or withdrawn. Only a subset of the difference between the number of requests submitted and those honored by Apple may represent some set of devices that were never disclosed.

Here’s how the amount of devices those countries request has trended over time:

Poland requested over half a million devices in H2 of 2014! I had to go back and double check the original reports just to make sure. And, they only submitted 30 requests that period.

To digest big datasets, I like to look at the histogram of the sample values’ z-scores. This shows the “shape” of the data, including its skew and kurtosis, so questions about very large populations can be answered at a glance. Then you can strategize for next steps: is it tightly grouped around the mean, uniformly random, or on a bell curve? Is there a long tail worth considering?
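For anyone who wants to try this, here is a minimal Python sketch of a z-score histogram. It is not the FastMath code used for the charts in this post: the function names, the one-standard-deviation bin width and the choice of population standard deviation are all assumptions, and the sample values are made up.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def z_scores(values):
    """Express each value as a number of standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return [(v - mu) / sigma for v in values]


def z_histogram(values):
    """Count samples per bin, where each bin is one standard deviation wide."""
    bins = Counter(math.floor(z) for z in z_scores(values))
    return dict(sorted(bins.items()))


# Made-up sample: mostly small values plus one extreme outlier.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]
print(z_histogram(sample))  # {-1: 9, 2: 1}
```

The single outlier pulls the mean upward, so the other nine values all land in the bin just below the mean while the outlier sits alone in the tail.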

This is the distribution of all countries’ requested device amounts, not just the top 10 countries:

Mean: ~3.3K devices; standard deviation: ~8.1K devices. Most of the request distributions look like this, meaning most of the countries behave similarly in terms of their requests, with some extreme outliers. Normalizing for other demographic information like GDP or population might change this picture significantly.

Again for those top 10 countries by number of devices requested, let’s look at the percentage of requests that Apple honored. The amount of requests is shown earlier, in the second chart, but the percentage better demonstrates the quality of a country’s requests. Because this is percentage of requests, and we don’t know details about how many devices were in a given honored or rejected request, it’s not possible to deduce how many devices were ultimately uncovered from these numbers. Note that the Y axis starts at 75%, not 0%:


Here’s how the quality of those countries’ requests has trended over time:

The countries generating the majority of the device requests are tending towards higher quality.

Finally, here’s how all the countries break down in terms of request quality:

Mean: 72.8%; standard deviation: 20%.

I got curious about the outliers on that chart: the two countries at -3 standard deviations are Serbia (who requested 1 device in their only request, which was denied) and Mexico (3 devices in 2 requests, both denied). OK, those are pretty non-impactful numbers… let’s look at the lowest acceptance rates of device requests:
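As a quick sanity check on where those outliers sit, here is a small Python sketch. It assumes a country whose requests were all denied has a 0% acceptance rate, and it uses the rounded mean and standard deviation quoted above. The helper name is made up.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev


# Rounded mean and standard deviation of device request acceptance rates.
z = z_score(0.0, mean=72.8, std_dev=20.0)
print(round(z, 2))  # -3.64
```

A 0% rate falls a little more than three standard deviations below the mean, which is consistent with both countries landing in the far left tail of the chart.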


And to get an idea of the impact and efficiency of these countries:


***

Whew, you made it through the first section! I’ll provide the same first several charts for the other types of requests considered–financial IDs and accounts–with no more commentary. If you’d prefer, you may skip to the end, where I talk about how I crunched the numbers.

Financial Identifier Requests


***

***

***
Mean: 817.2 financial IDs; standard deviation: 2447.7 financial IDs.
***

***

***
Mean: 77.2%; standard deviation: 26.5%.

Account Requests


***

***

***
Mean: 923.7 accounts; standard deviation: 3786.6 accounts.

Need a break from the charts? How about this interesting TIL I found while writing this post: the 68-95-99.7 rule (aka empirical rule), a mnemonic to remember the percentages of a normally distributed population within 1, 2 and 3 standard deviations from the mean, respectively.
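The rule’s percentages can be reproduced from the normal distribution itself: the fraction of a normal population within k standard deviations of the mean is erf(k / √2). Here is a short Python sketch using only the standard library; the function name is made up.

```python
import math


def fraction_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, f"{fraction_within(k):.1%}")
# 1 68.3%
# 2 95.4%
# 3 99.7%
```

This only holds for normally distributed data. The long-tailed request distributions in this post would not be expected to follow it closely.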


***

***
Mean: 64.1%; standard deviation: 34.5%.

Colophon

To compile these numbers from the CSVs, I used my FastMath and Pippin Swift libraries, adding some new functions along the way (they’re now at versions 2.2.1 and 12.1.0, respectively). The code specific to working with Apple’s CSVs lives in an Xcode test suite. (I discovered the excellent cocoapods-playgrounds and used it to set up an Xcode Playground initially, but wound up moving to a test suite to get better IDE support while working on the code.)
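To give a rough idea of the kind of processing involved, here is a minimal Python sketch that reads a CSV and ranks countries by one column. It is not the Swift code described above. The column names, the placeholder country names and the rows are all made up, and Apple’s real CSV layout differs.

```python
import csv
import io

# Hypothetical layout with made-up rows; not Apple's actual CSV format.
SAMPLE_CSV = """country,requests,devices
Country A,10,500
Country B,200,250
Country C,50,1000
"""


def top_countries(csv_text, column, n=10):
    """Return the n countries with the largest value in the given column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: int(row[column]), reverse=True)
    return [(row["country"], int(row[column])) for row in rows[:n]]


print(top_countries(SAMPLE_CSV, "devices", n=2))
# [('Country C', 1000), ('Country A', 500)]
```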

I used Numbers.app to create the final charts from my program’s output. I could not find a Swift drop-in graphing library that worked flawlessly for my needs; I evaluated Charts, SwiftCharts and core-plot (I also found PNChart after writing, which looks quite nice).

Coming Up Next

As part of researching for this post, I discovered a few new ways to visualize data that I’d like to try, including box plots of 5 number summaries of various distributions, even plotted over time. I’d like to move the code from an Xcode unit test suite to a command line application, linking FastMath and Pippin statically instead of the venerated use_frameworks! in my Podfile. It’d be nice to have a [web] app that dynamically renders all these charts, so I don’t have to use Pages.app any more! And finally, I’d like to see how these data correspond to others, like countries’ device sales, GDP, population, law enforcement and/or defense spending, or incarceration rate.
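For reference, the 5 number summary behind a box plot is the minimum, first quartile, median, third quartile and maximum. Here is a minimal Python sketch, assuming the “inclusive” quartile method from the standard library (other quartile conventions give slightly different values); the function name and sample are made up.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    # quantiles() with n=4 returns the three cut points between quarters.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
# (1, 3.0, 5.0, 7.0, 9)
```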

Perhaps now that the groundwork is laid, I’ll get to some of that for the next Transparency report around June 😎


🙏🏻 Enjoy the post? Please help support more like it by buying me a cup of coffee!

💡 Suggestions, questions, comments? Please submit a PR!