Dependency Analysis

The paralegal-flow tool analyzes the dependencies in a program and encodes them in a graph called the Program Dependence Graph (PDG). The flow analyzer integrates with cargo and rustc and can be invoked on the command line by running cargo paralegal-flow. In addition to the command line version, there is a programmatic way to invoke it via the SPDGGenCommand type that is part of the paralegal-policy framework.

This guide explains how the PDG extraction works and how to configure the flow analyzer with command line flags. When using the programmatic interface, you can use SPDGGenCommand::get_command() to access the underlying shell Command and pass additional arguments with Command::args().
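As an illustration of passing flags through Command::args(), here is a minimal sketch. It uses a plain std::process::Command as a stand-in for the command that get_command() gives you, so it does not depend on paralegal-policy. The helper names add_flow_args and configured_args, the crate name, the output path and the feature name are all hypothetical; the flags themselves are taken from the option tables in this guide.

```rust
use std::process::Command;

/// Appends extra `paralegal-flow` flags to a command.
/// `cmd` stands in for the `Command` obtained from `get_command()`.
fn add_flow_args(cmd: &mut Command) -> &mut Command {
    cmd.args(["--include", "my-crate"]) // hypothetical crate name
        .args(["--result-path", "flow-graph.out"]) // hypothetical output path
        // Everything after `--` is forwarded to cargo, so it must come last.
        .args(["--", "--features", "my-feature"]) // hypothetical feature name
}

/// Builds the stand-in command and returns its arguments as strings.
fn configured_args() -> Vec<String> {
    let mut cmd = Command::new("cargo");
    cmd.arg("paralegal-flow");
    add_flow_args(&mut cmd);
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}

fn main() {
    // Only prints the command line; nothing is executed.
    println!("cargo {}", configured_args().join(" "));
}
```

The sketch only assembles and prints the argument list. With the real programmatic interface, you would apply the same args() calls to the Command you get from the framework instead of constructing one yourself.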

The following overview table lists all available command line options, each with a short description and a link to the section of the guide with more information. You can also see the full list of options by running cargo paralegal-flow --help. The most common and useful options are highlighted in green.

Flag / Argument	Purpose	Further Reading
`--external-annotations <FILE>`	Provide a file with additional markers for third party code.	External Annotations
`-- <CARGO OPTIONS>`	Pass additional flags (like `--features`) to cargo. Must be the last passed option
`--result-path <PATH>`	A file to write the graph to.
`--include <crate-name>`	When specified, Paralegal only constructs PDGs for these crates. Can be supplied multiple times.	Paralegal approximates functions if their containing crate is excluded from analysis.
`--strict`	Abort analysis if, for example, dynamic dispatch is detected.

Advanced options for debugging and development. You will likely not need these.

Flag / Argument	Purpose	Further Reading
`--verbose`	More output (up to log level “info”)
`--debug`	Debugging output, intended for developers of `paralegal-flow`
`--debug-target <CTRL>`	The name of a controller for which to enable debug output.
`--dump <TYPE>`	Output analyzer intermediate representations
`--abort-after-analysis`	Force rustc to abort compilation once the SPDG has been created.	Caching
`--target <NAME>`	Select the crate for which to run the analysis. You can also select this via cargo options passed after `--`.
`--no-adaptive-approximation`	Turns off the adaptive approximation, which means all functions are included as PDGs.	Adaptive Approximation

You may also pass any of these options as environment variables by upper-casing the name. Some of the environment variables use the PARALEGAL_ prefix to avoid confusion. Check cargo paralegal-flow --help for the environment variable names used.

Cross-Crate Analysis

Paralegal’s PDG spans all functions (all code) reachable from the analysis entrypoint (see analyze and Targeting). Paralegal aims to produce a precise PDG for the portion of this code that is policy relevant. For the rest of the reachable code it employs an approximation (see Type Signature Abstraction) that is cheaper and produces smaller PDGs, which reduces end-to-end runtime. The rules below govern whether a function is modeled as a precise PDG or approximated. If you want stronger assurances from Paralegal, it is important to understand the rules that are influenced by configuration, as they can be a source of unsoundness.

Rules Paralegal applies automatically:

Paralegal will create a precise PDG for any function from which a marker is reachable.
Marked functions (except the entrypoint) are approximated by type.
Paralegal approximates functions from libraries where no source code is available.

User influenced rules:

Paralegal approximates functions if their containing crate is excluded from analysis.
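To make the interplay of these rules concrete, here is a minimal sketch of them as a decision function. It is not Paralegal's implementation: the FunctionFacts type, the choose_model function and, in particular, the order in which the rules are checked are assumptions made for illustration. The sketch also assumes that the entrypoint itself is always modeled precisely.

```rust
#[derive(Debug, PartialEq)]
enum Model {
    PrecisePdg,
    Approximated,
}

/// Facts about one function reachable from the entrypoint (hypothetical type).
struct FunctionFacts {
    is_entrypoint: bool,
    is_marked: bool,
    marker_reachable: bool,
    has_source: bool,
    crate_included: bool,
}

/// Applies the rules listed above. The precedence is an assumption.
fn choose_model(f: &FunctionFacts) -> Model {
    // Library functions without source code are approximated.
    if !f.has_source {
        return Model::Approximated;
    }
    // User-influenced rule: the containing crate is excluded from analysis.
    if !f.crate_included {
        return Model::Approximated;
    }
    // Marked functions, except the entrypoint, are approximated by type.
    if f.is_marked && !f.is_entrypoint {
        return Model::Approximated;
    }
    // A precise PDG is built when a marker is reachable from the function;
    // the rest of the reachable code is approximated.
    if f.is_entrypoint || f.marker_reachable {
        Model::PrecisePdg
    } else {
        Model::Approximated
    }
}

fn main() {
    let helper = FunctionFacts {
        is_entrypoint: false,
        is_marked: false,
        marker_reachable: true,
        has_source: true,
        crate_included: true,
    };
    println!("{:?}", choose_model(&helper));
}
```

Note how the crate_included check can turn a function that would otherwise get a precise PDG into an approximated one. This is the configuration-dependent case to watch when soundness matters.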

Type Signature Abstraction

As mentioned in Cross-Crate Analysis, Paralegal uses an approximation to deal with missing source code and as an optimization. This approximation uses a function’s type signature. As an example, consider the function modify_extract below:

// Approximated: only the type signature is used, not the body.
fn modify_extract(
	read: &T, written: &mut Q
) -> R;

fn main() {
	let src1 = make_t();
	let mut src2 = make_q();
	let result = modify_extract(
		&src1, &mut src2
	);
	read_result(result);
	read_src2(src2);
	read_src1(src1);
}

This would be represented in the PDG roughly as:

graph TD
	make_t --src1--> read_src1
	make_t --src1--> read --> modify_extract["modify_extract[@return]"] --> read_result
	make_q --src2--> written --> modify_extract

	written --> w2
	read --> w2
	w2["written[@return]"] --> read_src2
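The edges in this graph suggest a simple reading of the approximation: every argument flows into the return value and into every argument passed by mutable reference. The following sketch derives the edges for modify_extract under that assumption. The Param type and the approximate_edges function are hypothetical and only illustrate the idea; Paralegal's actual approximation may differ in its details.

```rust
/// One parameter of an approximated function (hypothetical type).
struct Param {
    name: &'static str,
    is_mut_ref: bool,
}

/// Derives dependency edges from a signature alone, assuming every argument
/// may influence the return value and every `&mut` argument.
fn approximate_edges(fn_name: &str, params: &[Param]) -> Vec<(String, String)> {
    let mut edges = Vec::new();
    let ret = format!("{fn_name}[@return]");
    for src in params {
        // Each argument may influence the return value.
        edges.push((src.name.to_string(), ret.clone()));
        // Each argument may influence the state of every `&mut` argument
        // after the call, including the `&mut` argument itself.
        for dst in params.iter().filter(|p| p.is_mut_ref) {
            edges.push((src.name.to_string(), format!("{}[@return]", dst.name)));
        }
    }
    edges
}

fn main() {
    let params = [
        Param { name: "read", is_mut_ref: false },
        Param { name: "written", is_mut_ref: true },
    ];
    for (from, to) in approximate_edges("modify_extract", &params) {
        println!("{from} --> {to}");
    }
}
```

For modify_extract this yields the four edges shown in the graph: read and written each flow into modify_extract[@return] and into written[@return].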