Research Overview

DAG-Constrained Coordinate Descent for Causal Discovery

Research Background

Traditional statistical analysis and machine learning primarily focus on revealing correlations among variables. However, for understanding complex socio-economic or biological phenomena, as well as for policy design and decision-making support, it is essential to identify the causal relationships underlying the data.

Within this context, causal inference aims to estimate intervention and counterfactual effects, while causal discovery seeks to estimate causal structures (directed acyclic graphs; DAGs) directly from observational data.

Recently, approaches that formulate causal discovery as a continuous optimization problem—notably exemplified by NOTEARS—have attracted much attention. These methods relax the combinatorial search of DAGs into a differentiable optimization framework. However, most existing approaches employ an \(L_1\)-norm penalty to promote sparsity, which has been shown to lack consistency over the Markov equivalence class of the true graph. Conversely, the \(L_0\)-norm penalty offers theoretical consistency but is non-continuous and thus difficult to optimize via gradient-based methods.

Basic DAG Structure Complex Causal Network
Figure 1. Examples of Directed Acyclic Graphs (DAGs) representing underlying causal mechanisms.

Core Methodology

1. Model Assumptions & A-formulation

We consider a linear Structural Equation Model (SEM) with independent noise:
$$X = B^T X + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \Omega)$$
where $B$ is the weighted adjacency matrix and $\Omega$ is a diagonal noise covariance matrix. We introduce the transformation:
$$A := (I - B)\Omega^{-1/2}$$
Under this A-formulation, the log-likelihood optimization is converted into finding a sparse, invertible matrix $A$ that respects the DAG constraint.

2. Coordinate Descent Algorithm

Our method optimizes the structure by updating one coordinate (edge) at a time. Unlike gradient-based methods that move along the steepest descent in continuous space, Coordinate Descent (CD) explores the discrete structure space by making locally optimal, discrete moves that are guaranteed to stay within the DAG manifold.

In each Epoch, the algorithm performs a random permutation of all unordered node pairs $\{i, j\}$. For each pair, we evaluate two potential directed edges ($i \to j$ and $j \to i$). The update is only accepted if it satisfies the acyclicity constraint and provides a larger objective decrease than the alternative direction.

CD Search Space
Figure 2. Conceptual visualization of the Coordinate Descent search process on the DAG manifold.

Epoch-based Traversal Strategy

By iterating through epochs and diagonal updates, the algorithm ensures that every potential causal link is revisited under the context of the currently estimated graph. This iterative refinement allows the model to escape local minima that often trap naive greedy search algorithms.

Epoch-based CD Flow
Figure 3. Logic flow of the Epoch-based Coordinate Descent (A-formulation).

Analytical Closed-Form Solution

Fixing all other entries, the optimal update $\delta^*$ is computed analytically:
$$\delta^* = \frac{2(b - \alpha)}{-(c + \alpha b) - \sqrt{(c - \alpha b)^2 + 4\alpha^2 c}}$$
where $c = S_{ii}$, $b = (SA)_{ij}$, and $\alpha = (A^{-1})_{ji}$. This ensures each step is computationally exact and numerically stable, avoiding the need for expensive line-search or inner-loop iterations.

Experimental Results

IndexCD (Ours)CD_L0 (Ours)GOLEM_EVGOLEM_NV
10.790.780.600.50
20.810.640.560.57
31.000.881.001.00
40.740.380.020.20
50.610.450.010.03
60.840.840.350.23
70.520.360.000.00
80.810.560.060.00
90.600.680.080.00

The CD method shows superior performance in recovering true causal links, particularly in high-sparsity scenarios where GOLEM struggles with soft acyclicity penalties.

Application Example

The proposed method can be applied to various domains requiring causal structure discovery from observational data, including economics, biology, and social sciences.

Application Example
Figure 4. Example application of causal discovery in real-world scenarios.

研究背景

従来の統計解析や機械学習は、主に変数間の「相関関係」を捉えることに重点を置いてきました。しかし、複雑な社会経済現象の理解や意思決定支援においては、データに潜む「因果関係」を特定することが不可欠です。本研究では、観測データから因果構造(DAG)を直接推定する「因果探索」に焦点を当てます。

近年、NOTEARSに代表される連続最適化による因果探索が注目されていますが、既存手法の多くは$L_1$ノルムペナルティによるバイアスや、マルコフ同値類上での一貫性の欠如という課題を抱えています。本研究では、$L_0$型スパース性を維持しつつ、厳密なDAG制約下で最適化を行う新しい座標降下法を提案します。

DAG構造例1 DAG構造例2
図1. 因果メカニズムを表現する有向非巡回グラフ(DAG)の例。

研究内容:アルゴリズムの詳細

座標降下法(Coordinate Descent)の戦略

提案アルゴリズムは、全変数を同時に更新するのではなく、各エッジを逐次的に更新します。各エッジの更新は、他のエッジの状態を固定した上での一次元最適化問題として解かれます。

エポックベースの更新: 毎エポックごとにノード対の順序をランダムに入れ替え(Permutation)、因果方向の有無と向きを判定します。これにより、局所解に陥るリスクを低減し、より真の構造に近いグラフへと収束させることが可能になります。

アルゴリズムフロー
図3. エポックベース座標降下法の処理フロー(A定式化)。

実験結果

提案手法は、特にスパース性が高いデータセットにおいて、既存のGOLEM等の手法を大きく上回る復元精度を達成しました。座標降下法による離散的な探索が、連続最適化におけるバイアスを効果的に回避していることが示唆されます。

応用例

提案手法は、観測データから因果構造を発見する必要がある様々な領域に適用可能です。経済学、生物学、社会科学などの分野で、真の因果メカニズムを明らかにすることができます。

応用例
図4. 実世界のシナリオにおける因果探索の応用例。