Hang Survey: A comprehensive study on software hang

The Problems

Software dependability is crucial to server and user applications, especially mission-critical ones. Unfortunately, software hang, a phenomenon of unresponsiveness, is still a major threat to software dependability. This kind of bug exists in many commodity software systems such as web browsers, database servers and office applications.

Most previous software bug studies focused on specific classes of bugs, such as operating system errors and network service bugs. None of them focused on software hang bugs.

Approaches

In this project, we present the first comprehensive study on real-world software hang bugs. We study the reports of hang-related bugs from four popular open-source applications: MySQL, PostgreSQL, Apache HTTPD server and Firefox, which are widely used in three-tier browser-server architectures. In total, we collect 307 confirmed hang bugs and analyze the characteristics of the 233 bugs with known root causes (categorized as Certain). We also examine the remaining 74 bugs (categorized as Uncertain) to study the obstacles to fixing them.

We classify the bugs in the Certain category into nine sub-categories: Design, Environment, Infinite Loop, Concurrency, Configuration, Inefficient Algorithm, User Operation Error, Plug In and Others. The full bug list is available at http://ppi.fudan.edu.cn/system/projects/hang survey/My View - LimeSurvey bug tracker.mht.

Observations

Our study makes the following observations:

  • Applications suffer heavily not only from well-known hang causes such as deadlocks, data races and infinite loops, but also from design errors and problems in the execution environment.
  • Uncertainty about the root cause of a hang bug is the main obstacle to fixing it, while concise test cases that reproduce the bug can significantly speed up the fix.
  • The fix time span study reveals that environment-related bugs, infinite loop bugs and concurrency bugs survive much longer than other types of bugs.
  • To mitigate hang-related bugs, operators should check protocol consistency before applying a software update, check the runtime resources before running an application, and simulate the runtime environment before deploying the application.
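
As an illustration of the last suggestion, a runtime resource check could be sketched as follows. This is only a minimal sketch and not part of the study: the function name preflight, its parameters and the 100 MB default threshold are hypothetical choices made for this example, and a real deployment would check whatever resources the application actually depends on.

```python
import shutil


def preflight(path=".", min_free_bytes=100 * 1024 * 1024, required_commands=()):
    """Return a list of problems found before starting an application.

    All names and thresholds here are illustrative assumptions.
    An empty list means every check passed.
    """
    problems = []

    # Check that the target file system has enough free space.
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"low disk space on {path}: {free} bytes free")

    # Check that every external command the application needs is on PATH.
    for command in required_commands:
        if shutil.which(command) is None:
            problems.append(f"required command not found: {command}")

    return problems


if __name__ == "__main__":
    issues = preflight()
    if issues:
        for issue in issues:
            print("preflight check failed:", issue)
    else:
        print("all preflight checks passed")
```

Running such a check before launch lets an operator refuse to start the application when a resource is missing, instead of discovering the shortage later as an unresponsive process.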

Summary

Software hang is a severe threat to software dependability. In this project, we present a comprehensive study on the characteristics of hang-related software bugs from four popular open-source applications. Our study presents nine categories of bugs that are major causes of software hang. Design, Environment, Infinite Loop and Concurrency are the four main contributors to software hang. The bug fix study shows that a well-formed bug report is key to accelerating hang bug fixes. We also provide several observations and suggestions on how to speed up hang bug fixes.

pub/projects/hang_survey.txt · Last modified: 2012/01/06 12:25 (external edit)