Scalable and Practical App Digging Engine

What is app digging?

App digging is the process of capturing and analyzing the runtime states of a mobile app. Examples of runtime states include:

  • Data such as news articles, recipes, and deals that a mobile app shows to its users when it runs (potentially after downloading it from back-end servers). This data is useful for enabling search over app content (in a general search engine or in an app-store search engine), for displaying contextual ads within apps, for verifying whether kids' apps display age-inappropriate content (and thereby violate COPPA regulations), etc.
  • How an app displays data to users: are buttons too small? Is there too much text? This information is useful for various accessibility analyses, such as determining whether an app is suitable for use in a vehicle.
  • How an app uses third-party controls; e.g., whether an app uses third-party ad controls in fraudulent ways or uses Facebook SDKs incorrectly in ways that allow various security attacks.
  • What information an app sends to back-end servers. The information is useful to verify various privacy properties of the app.
  • How an app performs when run under various external conditions; e.g., whether an app page fails to load or whether it crashes if the network is too slow. 

Many of the tasks mentioned above are becoming increasingly important for app stores (e.g., for checking apps' runtime security and privacy properties, or for capturing and indexing data inside apps for better app search), for app developers (e.g., for checking an app's runtime performance under various conditions), and for third-party SDK providers (e.g., for checking whether developers are using their SDKs correctly).

What is SPADE?

We develop SPADE, a collection of tools for quickly and automatically analyzing the runtime states of a large collection of mobile apps. SPADE uses two key techniques. First, it uses binary instrumentation to automatically insert custom code into an app binary to capture its runtime state. Second, it executes the instrumented app in a phone/tablet emulator and automatically navigates through various app pages by emulating user interactions. SPADE employs a number of novel optimizations to increase the coverage (i.e., the fraction of total app pages that are explored) and speed (i.e., the number of unique app pages explored per unit time) of its exploration.
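To make the coverage and speed metrics concrete, here is a minimal sketch of automated page exploration over a toy app model. It is not SPADE's implementation: the `TOY_APP` page graph, the action names, and the `explore`, `coverage`, and `speed` helpers are all hypothetical, breadth-first search is just one simple exploration strategy, and speed is approximated as unique pages per emulated interaction rather than per unit of wall-clock time.

```python
from collections import deque

# Hypothetical toy app: each page maps a UI action to the page it opens.
# A real explorer would discover these transitions at runtime in an emulator.
TOY_APP = {
    "home": {"tap_news": "news", "tap_deals": "deals"},
    "news": {"tap_article": "article", "back": "home"},
    "deals": {"back": "home"},
    "article": {"back": "news"},
    "settings": {"back": "home"},  # no emulated action leads here
}


def explore(app, start, max_interactions):
    """Breadth-first exploration; returns (visited pages, interactions used)."""
    visited = {start}
    queue = deque([start])
    interactions = 0
    while queue and interactions < max_interactions:
        page = queue.popleft()
        for action, target in app[page].items():
            if interactions >= max_interactions:
                break
            interactions += 1  # one emulated user interaction
            if target not in visited:
                visited.add(target)
                queue.append(target)
    return visited, interactions


def coverage(visited, app):
    """Fraction of all app pages that were explored."""
    return len(visited) / len(app)


def speed(visited, interactions):
    """Unique pages per emulated interaction (a stand-in for elapsed time)."""
    return len(visited) / interactions if interactions else 0.0


if __name__ == "__main__":
    pages, used = explore(TOY_APP, "home", max_interactions=100)
    print(sorted(pages), used, coverage(pages, TOY_APP), speed(pages, used))
```

In this toy model the "settings" page is never reached, so coverage stays below 1 regardless of the interaction budget. This illustrates why a plain graph walk is not enough and why an explorer benefits from optimizations that raise coverage and speed.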

An overview of the project, with scenarios and technical challenges, can be found here.


Contact: Suman Nath