Soundness of the Graphs

Paralegal’s dependency graphs are deigned to soundly reflect the dependency relationships of safe Rust. The analysis is a modified version of Flowistry (Crichton et.al. from PLDI 2022) and a formal proof of soundness for Flowistry is provided in their paper. The following paragraphs describe the Semantic Program Dependence Graph (SPDG), created with the Flowistry analysis, and its soundness properties. The descriptions are simplified for terseness with no loss in applicability.

Definitions

Soundness: Both Flowistry and Paralegal statically model dependency relationships of data in a Rust program. Such a dependency relation is sound if each “true” dependency between any two data $a$ and $b$ in the program is reflected in the model. A dependency is “true” if there exists at least one execution trace where a change in the value of only $a$ would entail a change in the value of $b$.

Completeness: Analogously, the dependency model is complete if each “true” independence is reflected in the model.

SPDG: is a program dependence graph (PDG) that models the control- and data-flow of a program as nodes, representing program locations, and edges, representing dependencies, respectively labeled “control” or “data”. In the SPDG nodes additionally carry a set of semantically meaningful markers which the policies reference.

A snippet of the delete controller for the Plume web application

Paralegal’s extracted SPDG for the snippet above.

The nodes of Paralegal’s SPDG are call paths that uniquely identify a function call site relative to the entry point of the analysis. For visual aid a short example is shown on the left.

The SPDG captures both data and control flow and as a result has two types of edges. $n_1 \xrightarrow{\mathsf{data}} n_2$ describes a direct data dependency between the nodes $n_1$ and $n_2$ (blue in the example), meaning there is a value produced by $n_1$, some part of which which is received by $n_2$ with no other function call $n_3$ modifying it in between. Control flow edges, $n_1 \xrightarrow{\mathsf{control}} n_2$ (green in the example), mean that a value created by $n_1$ directly determines whether $n_2$ executes.

There are two more node types in the SPDG, a singleton “return” type node that represents what is returned from the function serving as entry point to the analysis and a “argument” type nodes that represent inputs to the entry point function (if any). For the purposes of soundness, we can disregard these two node types, as they act the same as call site nodes. Currently all non-function call operations are represented in the SPDG only as their data and control flow influence, i.e., as edges.

Soundness

Paralegal’s SPDG construction is underpinned by the same reasoning employed in Flowistry. Whereas Flowistry constructs a matrix that associates each program location with all of its transitive dependencies however, Paralegal constructs a PDG that associates only locations of function calls with their direct dependencies. We have not yet formally proven that the same Soundness guarantees for safe Rust that was proven for Flowistry transfer to Paralegals PDG construction. As such this section discusses the soundness guarantees that Paralegal is engineered to achieve and we defer their formal proof to future work

With respect to a value created by a node $n_s$ the SPDG:

If a marked function call $n_s$ can influence an input to a marked function call $n_t$, then there exists a path $n_s \xrightarrow{\mathsf{data}|\mathsf{control}}^* n_t$ in the SPDG.
If a marked function call $n_s$ can influence whether the marked function call $n_t$ executes, then there exists a path of the shape $n_s \xrightarrow{\mathsf{data}|\mathsf{control}}^* n_i \xrightarrow{\mathsf{control}} n_t$, with $n_i$ being some arbitrary node. Informally, $n_s$ can influence the execution of $n_t$ if there exists some path between them with the last link being of type $\mathsf{control}$.

Soundness is only guaranteed for effects expressed in safe Rust or captured by type signatures and only within the program. In particular the following compromise soundness:

Soundness Threat: Interior Mutability

Data structures such as RefCell and Mutex allow mutation through immutable references. Those “true” dependencies are not captured.

let rc = RefCell::new(src);
let bm = rc.borrow_mut();
*bm = unseen;
read(rc.borrow());

In this small example the “true” influence from the unseen value on read() is lost (dotted line).

graph TD
  src -->|"new()"| rc
	rc -->|"borrow_mut()"| bm
	unseen --> bm
	src -->|"borrow()"| read["read()"]
	bm -..-> read

Soundness of the Graphs

Definitions

Soundness

Soundness Threat: Interior Mutability

Soundness Threat: Lifetime Coercion