Accuracy-related penalties are supposed to promote voluntary compliance. Congress has directed the IRS to develop better information concerning the effects of penalties on voluntary compliance, and it is the IRS’s official policy to recommend changes when the Internal Revenue Code (IRC) or penalty administration does not effectively do so. The objective of this study was to estimate the effect of accuracy-related penalties on Schedule C filers (i.e., sole proprietors) whose examinations were closed in 2007. TAS compared their subsequent compliance to a group of otherwise similarly situated “matched pairs” of taxpayers who were not penalized. TAS used Discriminant Function (or “DIF”) scores — an IRS estimate of the likelihood that an audit of the taxpayer’s return would produce an adjustment — as a proxy for a taxpayer’s subsequent compliance.
While all groups of Schedule C filers who were subject to an examination assessment improved their reporting compliance (as measured by reductions in their DIF scores), those subject to an accuracy-related penalty had no better subsequent reporting compliance than those who were not. Thus, accuracy-related penalties did not appear to improve reporting compliance among the Schedule C filers who were subject to them. Further, penalized taxpayers who were also subject to a default assessment or who appealed their assessment had smaller reductions in DIF scores, suggesting lower reporting compliance five years later as compared to similarly situated taxpayers who were not penalized. n2 Similarly, those whose penalty was abated had smaller reductions in DIF scores, suggesting lower reporting compliance five years later as compared to taxpayers whose penalty was not abated.
n2 Except as otherwise indicated, all differences discussed in this report are statistically significant (with 95 percent confidence). We note, however, that the DIF is an approximate measure of reporting compliance, and small differences, although statistically significant, may not indicate a real difference in reporting compliance.
Prior research suggests that a taxpayer’s perception of the fairness of the tax law, the IRS, and the government drives voluntary compliance decisions, and the findings of this study are consistent with that research. Taxpayers subject to default assessments may be more likely to feel the penalty assessment process was unfair, which may have caused lower levels of future compliance. Similarly, those who appeal may be more likely to feel that the actual result was unfair, which may have caused lower levels of future compliance. Finally, those subject to a penalty assessment that is later abated may also feel that the IRS initially sought to penalize them unfairly, potentially causing lower levels of future compliance.
These findings have a number of policy implications. First, the IRS should revise its procedures to ensure that it does not propose a penalty before exhausting efforts to communicate with a taxpayer to determine whether a penalty actually applies. By design, automated procedures — those that presume a penalty applies unless a taxpayer explains and documents why it does not — are likely to generate more default assessments and penalty abatements than other examination methods. As taxpayers who were penalized after default assessments or whose penalties were abated had smaller reductions in DIF scores, suggesting lower levels of voluntary compliance after five years than those who were not, these automated procedures may be inconsistent with the IRS’s goal of promoting voluntary compliance.
Second, the IRS’s Appeals function should consider doing more to objectively evaluate and then explain its determinations, particularly when it sustains a penalty. As taxpayers who were penalized after an appeal had smaller reductions in DIF scores, suggesting lower levels of compliance after five years than those who were not penalized, it is possible that they did not perceive Appeals as fairly evaluating whether the penalty should apply. Finally, in the case of penalties that taxpayers generally regard as unfair (e.g., where a reasonable cause exception does not apply, or where it may be interpreted so narrowly as to, in effect, create a strict liability penalty), the IRS should consider applying a broader reasonable cause exception (or work with the Treasury Department to propose one) that is simple, fair, transparent, and easy to administer.
I thought readers might also like the following selected excerpts from the Report on methodology (footnotes omitted):
The single largest component of the tax gap — the gap between the amount of tax due and the amount voluntarily and timely paid — is underreporting of business income by individuals. Thus, this study focuses on the effect of accuracy-related penalties, which apply to underreporting, on Schedule C filers.
* * * *
TAS identified sole proprietors subject to audit adjustments in 2007 and used changes in their “DIF” scores as a proxy for changes in their reporting compliance.
TAS sought to determine how accuracy-related penalty assessments affect subsequent reporting compliance by sole proprietors (i.e., those who file Form 1040, U.S. Individual Income Tax Return, with a Schedule C, Profit or Loss from Business). TAS focused on those subject to an examination adjustment in 2007 for tax year (TY) 2003 or later. TAS gauged reporting compliance using the IRS’s computer algorithms (called a Discriminant Function or “DIF” score) that estimate the likelihood that an audit of the taxpayer’s return would produce an adjustment (i.e., a higher DIF generally corresponds to lower reporting compliance).
Because DIF scores are computed separately for taxpayers in each “exam activity code” (EAC) each year, the scores of those in one EAC are not comparable to the scores of those in another EAC or to DIF scores computed for different tax years. To compare taxpayers in different EACs and for different years, TAS scaled the DIF scores. For each year, TAS first sorted all of the taxpayers in each EAC by DIF, and then assigned the taxpayers a scaled DIF score based on the decile into which they fell. For example, TAS assigned those in the first decile a scaled DIF score of 1 and those in the 10th decile a scaled DIF score of 10. TAS used changes in the taxpayer’s scaled DIF score as a proxy for changes in reporting compliance.
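The decile scaling described above can be sketched in a few lines of Python. This is only an illustration of the idea, not TAS's actual procedure: the function name, the field names (taxpayer_id, year, eac, dif), and the handling of ties (broken by input order) are all assumptions, since the Report does not specify them.

```python
from collections import defaultdict


def scale_dif_scores(returns):
    """Map each (taxpayer_id, year) to a scaled DIF score from 1 to 10.

    `returns` is a list of dicts with the hypothetical keys
    "taxpayer_id", "year", "eac", and "dif". Raw DIF scores are only
    compared within the same year and exam activity code (EAC).
    """
    # Group returns so that each group holds one EAC in one year.
    groups = defaultdict(list)
    for r in returns:
        groups[(r["year"], r["eac"])].append(r)

    scaled = {}
    for members in groups.values():
        # Sort the group from lowest to highest raw DIF.
        # Assumption: ties keep their input order (sorted() is stable).
        ordered = sorted(members, key=lambda r: r["dif"])
        n = len(ordered)
        for rank, r in enumerate(ordered):
            # Ranks 0..n-1 are split into ten equal-sized bands: 1..10.
            scaled[(r["taxpayer_id"], r["year"])] = rank * 10 // n + 1
    return scaled
```

A change in reporting compliance would then be the difference between a taxpayer's scaled score in a later year and the scaled score in the audited year, with a negative change suggesting improved compliance.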
TAS identified matched pairs of similarly situated taxpayers — those subject to a penalty and those not subject to a penalty.
If the IRS consistently assessed an accuracy-related penalty against all similarly situated taxpayers, then it would be difficult to determine whether differences in future compliance were due to differences in the taxpayers, the audit, or the penalty itself. In technical terms, the analysis would suffer from “selection bias.” To minimize this problem, TAS sought to analyze matched pairs of similarly situated taxpayers that were different in only one respect: One was assessed a 20 percent accuracy-related penalty and the other was not. Otherwise, the paired taxpayers were similar. They were in the same EAC (e.g., had similar levels of positive income and receipts), subject to the same type of examination (e.g., a field examination, office examination, or correspondence examination), and subject to an adjustment of a similar dollar amount (i.e., in the same quartile).
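The matching criteria above (same EAC, same type of examination, same adjustment quartile) can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the Report's method: the function and field names are hypothetical, and because the excerpt does not say how TAS chose among several eligible candidates, this version simply pairs taxpayers in input order and drops anyone left without a partner.

```python
from collections import defaultdict


def match_pairs(taxpayers):
    """Pair each penalized taxpayer with an unpenalized one in the same cell.

    `taxpayers` is a list of dicts with the hypothetical keys
    "taxpayer_id", "eac", "exam_type", "adjustment_quartile", and
    "penalized" (True or False).
    Returns a list of (penalized_id, unpenalized_id) tuples.
    """
    penalized = defaultdict(list)
    unpenalized = defaultdict(list)
    for t in taxpayers:
        # Taxpayers are comparable only if all three attributes agree.
        cell = (t["eac"], t["exam_type"], t["adjustment_quartile"])
        bucket = penalized if t["penalized"] else unpenalized
        bucket[cell].append(t["taxpayer_id"])

    pairs = []
    for cell, treated in penalized.items():
        controls = unpenalized.get(cell, [])
        # Assumption: pair in input order; zip() stops at the shorter
        # list, so unmatched taxpayers are excluded from the analysis.
        pairs.extend(zip(treated, controls))
    return pairs
```

Comparing the change in scaled DIF score within each pair then isolates the penalty as the one characteristic that differs between the two taxpayers, at least with respect to the attributes used for matching.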