KoTaP: A Panel Dataset for Corporate Tax Avoidance, Performance, and Governance in Korea


This study introduces the Korean Tax Avoidance Panel (KoTaP), a long-term panel dataset of non-financial firms listed on KOSPI and KOSDAQ between 2011 and 2024. After excluding financial firms, firms with non-December fiscal year ends, firms with capital impairment, and firms with negative pre-tax income, the final dataset consists of 12,653 firm-year observations from 1,754 firms. KoTaP is designed to treat corporate tax avoidance as a predictor variable and link it to multiple domains, including earnings management (accrual- and activity-based), profitability (ROA, ROE, CFO, LOSS), stability (LEV, CUR, SIZE, PPE, AGE, INVREC), growth (GRW, MB, TQ), and governance (BIG4, FORN, OWN). Tax avoidance itself is measured using complementary indicators: cash effective tax rate (CETR), GAAP effective tax rate (GETR), and book-tax difference measures (TSTA, TSDA), with adjustments to ensure interpretability. A key strength of KoTaP is its balanced panel structure with standardized variables and its consistency with international literature on the distribution and correlation of core indicators. At the same time, it reflects distinctive institutional features of Korean firms, such as concentrated ownership, high foreign shareholding, and elevated liquidity ratios, providing both international comparability and contextual uniqueness. KoTaP enables applications in benchmarking econometric and deep learning models, external validity checks, and explainable AI analyses. It further supports policy evaluation, audit planning, and investment analysis, making it a critical open resource for accounting, finance, and interdisciplinary research.


💡 Research Summary

The paper presents the Korean Tax Avoidance Panel (KoTaP), a newly constructed, publicly available long‑term panel dataset covering non‑financial firms listed on Korea’s KOSPI and KOSDAQ exchanges from 2011 through 2024. After applying strict screening criteria—excluding financial institutions, firms with fiscal years that do not end in December, firms experiencing capital impairment, and firms with negative pre‑tax earnings—the final sample consists of 1,754 firms and 12,653 firm‑year observations, forming a balanced panel.
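The four exclusion rules can be read as a single row-level filter. The sketch below is only an illustration of that reading, not the authors' pipeline: the `FirmYear` record, its field names, and the use of non-positive book equity as a proxy for capital impairment are all assumptions introduced here, and the sample rows are invented.

```python
from dataclasses import dataclass


@dataclass
class FirmYear:
    """Hypothetical firm-year record; field names are not KoTaP column names."""
    firm_id: str
    year: int
    is_financial: bool       # True for financial-sector firms
    fiscal_month_end: int    # month in which the fiscal year ends (1-12)
    total_equity: float      # book equity, used here as an impairment proxy
    pretax_income: float


def passes_screen(obs: FirmYear) -> bool:
    """Return True if the observation survives all four exclusion rules."""
    return (
        2011 <= obs.year <= 2024
        and not obs.is_financial
        and obs.fiscal_month_end == 12
        and obs.total_equity > 0       # assumption: impairment = non-positive equity
        and obs.pretax_income >= 0     # drop negative pre-tax income
    )


# Invented rows, one per exclusion rule plus one that is kept.
sample = [
    FirmYear("A", 2015, False, 12, 500.0, 80.0),    # kept
    FirmYear("B", 2015, True, 12, 900.0, 120.0),    # financial firm
    FirmYear("C", 2016, False, 3, 300.0, 40.0),     # March fiscal year end
    FirmYear("D", 2017, False, 12, -10.0, 5.0),     # impaired capital (proxy)
    FirmYear("E", 2018, False, 12, 200.0, -15.0),   # negative pre-tax income
]
kept = [obs for obs in sample if passes_screen(obs)]
```

Writing the screen as one predicate keeps each rule visible on its own line, so a rule can be relaxed for a robustness check without touching the others.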

KoTaP’s core contribution is the systematic measurement of corporate tax avoidance. Four complementary indicators are provided: (i) Cash Effective Tax Rate (CETR), calculated as cash taxes paid divided by pre‑tax income; (ii) GAAP Effective Tax Rate (GETR), total tax expense divided by pre‑tax income; (iii) Total Book‑Tax Difference (TSTA), an accrual‑based measure of the gap between accounting income and taxable income; and (iv) Discretionary Book‑Tax Difference (TSDA), which isolates the portion of the gap attributable to discretionary accruals. To capture both short‑run volatility and longer‑run strategic behavior, three‑year and five‑year rolling averages of CETR and GETR (A_CETR3, A_GETR5, etc.) are also supplied.
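As a rough illustration of the two ratio measures and their multi-year averages, the sketch below computes an annual rate as a tax amount over pre-tax income and then a trailing mean. The summary does not say how the rolling averages are constructed, so the trailing simple mean of annual ratios is an assumption, as are the function names and the choice to return `None` when the denominator is not positive. The input figures are invented.

```python
from typing import List, Optional


def effective_tax_rate(tax: float, pretax_income: float) -> Optional[float]:
    """Tax amount divided by pre-tax income.

    Pass cash taxes paid for CETR or total tax expense for GETR.
    Returns None when pre-tax income is not positive (ratio not meaningful).
    """
    if pretax_income <= 0:
        return None
    return tax / pretax_income


def rolling_mean(values: List[Optional[float]], window: int) -> List[Optional[float]]:
    """Trailing mean over `window` consecutive years.

    Yields None until a full window of non-missing values is available.
    """
    out: List[Optional[float]] = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        if len(chunk) < window or any(v is None for v in chunk):
            out.append(None)
        else:
            out.append(sum(chunk) / window)
    return out


# Invented figures for one firm over four consecutive years.
cash_taxes = [20.0, 30.0, 10.0, 25.0]
pretax = [100.0, 100.0, 100.0, 100.0]

cetr = [effective_tax_rate(t, p) for t, p in zip(cash_taxes, pretax)]
cetr_3y = rolling_mean(cetr, window=3)   # first two entries are None
```

An alternative convention in the tax literature sums taxes and income over the window before dividing; which of the two KoTaP uses should be checked against the paper's variable definitions.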

Beyond tax avoidance, KoTaP integrates roughly twenty additional variables spanning four major dimensions: profitability (ROA, ROE, operating cash flow scaled by assets, and a loss dummy), stability (leverage, current ratio, firm size, PPE intensity, firm age, and inventory‑receivables ratio), growth (sales growth, market‑to‑book ratio, Tobin’s Q), and market valuation/governance (KOSPI listing dummy, BIG4 audit dummy, foreign ownership share, and largest shareholder ownership share). These variables are derived from publicly disclosed financial statements and are carefully cleaned, standardized, and adjusted for outliers and missing values.

The authors detail the data‑construction workflow: (1) defining sampling rules and gathering raw observations (approximately 25,000); (2) computing the tax‑avoidance metrics and the ancillary financial and governance variables; (3) performing rigorous preprocessing to eliminate missing entries and extreme outliers, resulting in the final analytic sample. Variable definitions, calculation formulas, and data‑source references are exhaustively documented, ensuring transparency and reproducibility.
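The preprocessing step is described only in general terms, so the following is a generic sketch rather than the authors' procedure. It drops rows with any missing variable and then clamps a variable to chosen quantiles (winsorization). The quantile cut-offs, the nearest-rank rule, the function names, and the sample values are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def drop_incomplete(rows: List[Dict[str, Optional[float]]]) -> List[Dict[str, Optional[float]]]:
    """Keep only rows in which every variable is present."""
    return [row for row in rows if all(v is not None for v in row.values())]


def winsorize(values: List[float], lower: float = 0.01, upper: float = 0.99) -> List[float]:
    """Clamp values to the empirical [lower, upper] quantiles (nearest rank)."""
    if not values:
        return []
    ordered = sorted(values)
    last = len(ordered) - 1
    low_cut = ordered[round(lower * last)]
    high_cut = ordered[round(upper * last)]
    return [min(max(v, low_cut), high_cut) for v in values]


# Invented rows: the second one has a missing ROA and is dropped.
rows = [
    {"ROA": 0.05, "LEV": 0.40},
    {"ROA": None, "LEV": 0.55},
    {"ROA": 0.08, "LEV": 0.30},
]
complete = drop_incomplete(rows)

# Invented series with one extreme value at each end.
raw = [-100.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
clipped = winsorize(raw, lower=0.10, upper=0.90)
```

Winsorizing keeps the observation count unchanged, whereas trimming removes the extreme rows; the summary's wording ("eliminate ... extreme outliers") leaves open which of the two was applied.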

Descriptive analyses confirm that KoTaP aligns with international literature regarding the distribution and correlation of core indicators while also reflecting distinctive Korean institutional features. For instance, CETR and GETR exhibit a strong positive correlation, whereas TSTA and TSDA capture distinct dimensions of book‑tax mismatches. Korean firms also display elevated liquidity ratios, concentrated ownership, and a pronounced presence of foreign shareholders, consistent with chaebol‑style ownership structures combined with significant foreign investment.

The paper outlines a broad spectrum of potential applications. Researchers can employ traditional panel econometric techniques (fixed‑effects, random‑effects, difference‑in‑differences) to test causal links between tax avoidance and firm performance, risk, or governance outcomes. Machine‑learning approaches—such as LASSO, Ridge, random forests, gradient boosting, and deep neural networks—can exploit the multidimensional nature of the data to uncover nonlinear effects, interaction terms, and threshold dynamics. Explainable AI tools (SHAP, LIME) enable the visualization of variable importance, facilitating policy simulations and audit‑risk assessments. Moreover, the dataset allows for cross‑validation of findings across multiple tax‑avoidance measures and outcome variables, addressing a common criticism in the tax‑avoidance literature concerning reliance on single indicators.
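To make the fixed-effects idea concrete, the sketch below implements a one-regressor within estimator: both variables are demeaned inside each firm, which removes any time-invariant firm effect, and the slope is then estimated on the demeaned data. Pairing CETR with ROA is only an example specification, the function name is hypothetical, and the panel is synthetic, built so that the true slope is 0.5 with different firm intercepts.

```python
from collections import defaultdict
from typing import List, Tuple


def within_estimator(panel: List[Tuple[str, float, float]]) -> float:
    """Slope of y on x after demeaning both within each firm.

    `panel` holds (firm_id, x, y) tuples. This is the one-way
    fixed-effects (within) estimator for a single regressor.
    """
    groups = defaultdict(list)
    for firm, x, y in panel:
        groups[firm].append((x, y))

    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2

    if sxx == 0:
        raise ValueError("no within-firm variation in x")
    return sxy / sxx


# Synthetic panel: (firm, CETR, ROA), with ROA = firm effect + 0.5 * CETR.
panel = [
    ("A", 0.1, 0.07), ("A", 0.2, 0.12), ("A", 0.3, 0.17),   # firm effect 0.02
    ("B", 0.2, 0.15), ("B", 0.3, 0.20), ("B", 0.4, 0.25),   # firm effect 0.05
]
beta = within_estimator(panel)   # close to 0.5 on this synthetic data
```

A pooled regression on the same rows would mix the firm intercepts into the slope; demeaning within firm is what separates the two. For real work with several regressors and clustered standard errors, a dedicated panel library is the more practical route.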

Finally, by making KoTaP openly accessible, the authors promote academic reproducibility, policy transparency, and interdisciplinary collaboration. The dataset serves as a foundation for comparative studies between Korea and other jurisdictions, for the development of AI‑driven predictive models of tax‑avoidance behavior, and for evaluating the impact of institutional reforms such as IFRS adoption, tax‑code changes, and enhancements in audit quality. In sum, KoTaP fills a critical gap in the empirical tax‑avoidance literature, offering a rich, high‑quality resource that bridges accounting, finance, economics, and data‑science communities.

